Pattern table to Pandas DataFrame


Tag : python , By : negonicrac
Date : November 27 2020, 03:01 PM

Working from the source of Pattern's table function, I came up with this:
from pattern.en import parse
from pattern.text.tree import WORD, POS, CHUNK, PNP, REL, ANCHOR, LEMMA, IOB, ROLE, MBSP, Text
import pandas as pd

def sentence2df(sentence, placeholder="-"):
    tags  = [WORD, POS, IOB, CHUNK, ROLE, REL, PNP, ANCHOR, LEMMA]
    tags += [tag for tag in sentence.token if tag not in tags]
    def format(token, tag):
        # Returns the token tag as a string.
        if   tag == WORD   : s = token.string
        elif tag == POS    : s = token.type
        elif tag == IOB    : s = token.chunk and (token.index == token.chunk.start and "B" or "I")
        elif tag == CHUNK  : s = token.chunk and token.chunk.type
        elif tag == ROLE   : s = token.chunk and token.chunk.role
        elif tag == REL    : s = token.chunk and token.chunk.relation and str(token.chunk.relation)
        elif tag == PNP    : s = token.chunk and token.chunk.pnp and token.chunk.pnp.type
        elif tag == ANCHOR : s = token.chunk and token.chunk.anchor_id
        elif tag == LEMMA  : s = token.lemma
        else               : s = token.custom_tags.get(tag)
        return s or placeholder

    columns = [[format(token, tag) for token in sentence] for tag in tags]
    columns[3] = [columns[3][i]+(iob == "I" and " ^" or "") for i, iob in enumerate(columns[2])]
    del columns[2]
    header = ['word', 'tag', 'chunk', 'role', 'id', 'pnp', 'anchor', 'lemma']+tags[9:]

    if not MBSP:
        del columns[6]
        del header[6]

    return pd.DataFrame(
        [[x[i] for x in columns] for i in range(len(columns[0]))],
        columns=header,
    )
>>> string = parse('I want to go to the Restaurant as I am hungry very much')
>>> sentence = Text(string, token=[WORD, POS, CHUNK, PNP])[0]
>>> df = sentence2df(sentence)
>>> print(df)
          word  tag   chunk role id  pnp lemma
0            I  PRP      NP    -  -    -     -
1         want  VBP      VP    -  -    -     -
2           to   TO    VP ^    -  -    -     -
3           go   VB    VP ^    -  -    -     -
4           to   TO       -    -  -    -     -
5          the   DT      NP    -  -    -     -
6   Restaurant  NNP    NP ^    -  -    -     -
7           as   IN      PP    -  -  PNP     -
8            I  PRP      NP    -  -  PNP     -
9           am  VBP      VP    -  -    -     -
10      hungry   JJ    ADJP    -  -    -     -
11        very   RB  ADJP ^    -  -    -     -
12        much   JJ  ADJP ^    -  -    -     -
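
The last step of sentence2df turns per-tag column lists into rows before building the DataFrame. A minimal sketch of that step in isolation, assuming made-up column data (the values below are illustrative and are not produced by Pattern), shows that the index-based transposition can also be written with zip:

```python
import pandas as pd

# Hypothetical per-tag columns, shaped like the `columns` list in sentence2df.
columns = [
    ["I", "want", "to"],      # word
    ["PRP", "VBP", "TO"],     # tag
    ["NP", "VP", "VP ^"],     # chunk
]
header = ["word", "tag", "chunk"]

# Index-based transposition, as in the function above.
rows_indexed = [[col[i] for col in columns] for i in range(len(columns[0]))]

# Equivalent transposition with zip: one tuple per token.
rows_zipped = [list(row) for row in zip(*columns)]

df = pd.DataFrame(rows_zipped, columns=header)
print(df)
```

Both forms yield one row per token; zip is simply shorter when every column has the same length.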


Get HTML table into pandas Dataframe, not list of dataframe objects


Tag : python , By : user186012
Date : March 29 2020, 07:55 AM
According to http://pandas.pydata.org/pandas-docs/version/0.17.1/io.html#io-read-html, "read_html returns a list of DataFrame objects, even if there is only a single table contained in the HTML content".
So select the first table from the list before cleaning it: df = df[0].dropna(axis=0, thresh=4) should do what you want.
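
As a rough illustration of that line, the sketch below uses a hand-built list named tables as a stand-in for the return value of read_html, so it runs without an HTML parser installed. The data and the name tables are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Stand-in for what pd.read_html(...) returns: a list holding one DataFrame.
tables = [
    pd.DataFrame({
        "a": [1.0, np.nan, 3.0],
        "b": [4.0, np.nan, 6.0],
        "c": [7.0, np.nan, 9.0],
        "d": [10.0, 11.0, np.nan],
    })
]

# Pick the single table out of the list, then keep only the rows
# that have at least 4 non-missing values.
df = tables[0]
cleaned = df.dropna(axis=0, thresh=4)
print(cleaned)
```

Only the first row has four non-missing values, so it is the only one kept.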

Extract Pattern in Pandas Dataframe


Tag : python , By : wiznick
Date : March 29 2020, 07:55 AM
I am extracting a pattern from a column of a DataFrame. Some rows contain the word 'Oscar' and some contain 'Oscars'. How can I extract it in pandas? My extract line gives an error.
Is this what is needed?
import pandas as pd
df = pd.DataFrame({'a': [1,2,3,4], 'b': ['is Oscar','asd','Oscars','not an Oscars q']})

df['c'] = ['Won 3 Oscars. Another 234 wins & 312 nominations.',
'Won 7 Oscars. Another 215 wins & 169 nominations.',
'Won 11 Oscar. Another 174 wins & 113 nominations.',
'Won 4 Oscars. Another 122 wins & 213 nominations.']
df['c'].str.extract(r'Won (\d+) Oscars?', expand=True).fillna(0)
    0
0   3
1   7
2  11
3   4
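
Note that extract returns the captured digits as strings. If numeric counts are needed, one possible follow-up (a sketch, with a hypothetical non-matching row added to show the fill value) is to convert with pd.to_numeric before filling:

```python
import pandas as pd

awards = pd.Series([
    "Won 3 Oscars. Another 234 wins & 312 nominations.",
    "Won 11 Oscar. Another 174 wins & 113 nominations.",
    "Nominated for 1 Oscar. Another 5 wins.",  # hypothetical row with no match
])

# expand=False returns a Series because the pattern has a single capture group.
matched = awards.str.extract(r"Won (\d+) Oscars?", expand=False)

# Non-matching rows come back as missing values; convert, then fill with 0.
oscars = pd.to_numeric(matched, errors="coerce").fillna(0).astype(int)
print(oscars.tolist())
```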

replace pandas DataFrame value pattern


Tag : python , By : yarry
Date : March 29 2020, 07:55 AM
I think you need str.strip:
df['order_number'] = df['order_number'].str.strip('"').astype(float)
Alternatively, remove the quote characters with replace:
df['order_number'] = df['order_number'].replace('"','', regex=True).astype(float)
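
A small self-contained sketch of both approaches, assuming an order_number column whose values arrive wrapped in literal double quotes (the sample values are invented):

```python
import pandas as pd

# Hypothetical data: numbers stored as strings with surrounding quotes.
df = pd.DataFrame({"order_number": ['"1001"', '"1002.5"', '"1003"']})

# str.strip removes the quote character from both ends only.
stripped = df["order_number"].str.strip('"').astype(float)

# str.replace removes every quote character, wherever it occurs.
replaced = df["order_number"].str.replace('"', "", regex=False).astype(float)

print(stripped.tolist())
```

The two give the same result here; they differ only if a quote can appear in the middle of a value.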

Python convert multi-column pandas dataframe to single value table dataframe


Tag : python , By : David
Date : March 29 2020, 07:55 AM
I am looking to convert a pandas DataFrame with multi-level columns into a single-value table.
As @Wen stated, use melt:
df.rename_axis('Index').reset_index().melt('Index', value_name='Score')
         Index Name Paper  Score
0   2018-01-01    m     1     13
1   2018-06-01    m     1     11
2   2018-01-01    m     2     33
3   2018-06-01    m     2     43
4   2018-01-01    m     3     15
5   2018-06-01    m     3     30
6   2018-01-01    r     1     31
7   2018-06-01    r     1     36
8   2018-01-01    r     2     25
9   2018-06-01    r     2     23
10  2018-01-01    r     3     33
11  2018-06-01    r     3     37
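
To make the reshaping reproducible, here is a sketch that rebuilds a wide frame from the values in the output above and melts it. It is a variant rather than the exact call from the answer: it uses melt with ignore_index=False (available in pandas 1.1 and later) so that no id_vars has to be passed alongside MultiIndex columns. The frame name wide is an assumption.

```python
import pandas as pd

# Two-level columns: Name (m, r) by Paper (1, 2, 3).
columns = pd.MultiIndex.from_product(
    [["m", "r"], [1, 2, 3]], names=["Name", "Paper"]
)
wide = pd.DataFrame(
    [[13, 33, 15, 31, 25, 33],
     [11, 43, 30, 36, 23, 37]],
    index=["2018-01-01", "2018-06-01"],
    columns=columns,
)

# Melt every column into rows, keep the original index,
# then expose it as a regular column named "Index".
long_df = (
    wide.melt(value_name="Score", ignore_index=False)
        .rename_axis("Index")
        .reset_index()
)
print(long_df)
```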

Pandas Dataframe - Mysql select from table where condition in <A column from Dataframe>


Tag : mysql , By : Ari
Date : March 29 2020, 07:55 AM