
Pandas string search in list of dicts


Tag : python , By : Adam May
Date : January 12 2021, 01:40 AM

The easiest approach is to convert each cell to a string and test the string representation of the list of dicts:
df.test.astype(str).str.contains('data1')
Alternatively, compare the 'term' value of each dict:
df['test'].apply(lambda x: any(y.get('term') == 'data1' for y in x))
Or check whether the value appears under any key:
df['test'].apply(lambda x: any('data1' in y.values() for y in x))
Sample data and output:
import pandas as pd

a = [{'term': 'data1', 'a': "foo", 'b': "bar"},
 {'term': 'data2' ,'a': "foo", 'b': "bar"}]
b = [{'term': 'data4', 'a': "foo", 'b': "bar"},
 {'term': 'data2' ,'a': "foo", 'b': "bar"}]
df = pd.DataFrame({"test": [a, b]})
print (df)
                                                test
0  [{'term': 'data1', 'a': 'foo', 'b': 'bar'}, {'...
1  [{'term': 'data4', 'a': 'foo', 'b': 'bar'}, {'...

print (df.test.astype(str).str.contains('data1'))
0     True
1    False
Name: test, dtype: bool

print (df['test'].apply(lambda x: any(y.get('term') == 'data1' for y in x)))
0     True
1    False
Name: test, dtype: bool

print (df['test'].apply(lambda x: any('data1' in y.values() for y in x)))
0     True
1    False
Name: test, dtype: bool
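One caveat with the string approach is that it matches substrings anywhere in the cell's text, including inside longer values. The following is a minimal sketch of that difference; the data (`rows`, `df_demo`) is hypothetical and not from the original question.

```python
import pandas as pd

# Hypothetical data: 'data10' contains 'data1' as a substring.
rows = [
    [{'term': 'data1', 'a': 'foo'}],
    [{'term': 'data10', 'a': 'foo'}],
]
df_demo = pd.DataFrame({'test': rows})

# String search: a plain substring match, so both rows are flagged.
by_string = df_demo['test'].astype(str).str.contains('data1', regex=False)

# Key lookup: exact comparison against the 'term' value only.
by_key = df_demo['test'].apply(
    lambda dicts: any(d.get('term') == 'data1' for d in dicts)
)

print(by_string.tolist())
print(by_key.tolist())
```

If exact matches matter, the apply-based variants are probably the safer choice.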


Creating a list of dicts from Pandas df


Tag : development , By : Bimal Poudel
Date : March 29 2020, 07:55 AM
You can use iterrows for that. It lets you iterate over the rows as Series rather than dicts, but a Series behaves much like a dict (e.g. it has items(), __getitem__, etc.; older pandas versions also offered iteritems(), which was removed in pandas 2.0).
If you must have dicts, you can easily convert each Series to a dict using the to_dict() method.
list_of_dicts = list( row.to_dict() for key, row in df.iterrows() )
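As an alternative to looping, `DataFrame.to_dict` with `orient='records'` should produce the same list in a single call. A small sketch, using a hypothetical `df_demo` in place of the original `df`:

```python
import pandas as pd

# Hypothetical frame standing in for the original df.
df_demo = pd.DataFrame({'key1': ['val1', 'val3'], 'key2': ['val2', 'val4']})

# Row-by-row conversion, as in the answer above.
via_iterrows = [row.to_dict() for _, row in df_demo.iterrows()]

# One call: each row becomes a dict keyed by column name.
via_to_dict = df_demo.to_dict(orient='records')

print(via_to_dict)
```

The single call avoids building a Series for every row, which tends to matter on larger frames.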

Converting a string with list of dicts to pandas df


Tag : python , By : littlefuzz
Date : March 29 2020, 07:55 AM
If the string read from the file is valid JSON, parse it with json.loads and build the DataFrame from the resulting list of dicts. A list of dicts converts directly:
In [73]: pd.DataFrame.from_dict([{'key1':'val1','key2':'val2'},{'key1':'val1','key2':'val2'}])
Out[73]:
   key1  key2
0  val1  val2
1  val1  val2
import json

In [81]: s
Out[81]: '[{"key1":"val1","key2":"val2"},{"key1":"val1","key2":"val2"}]'

In [82]: pd.DataFrame.from_dict(json.loads(s))
Out[82]:
   key1  key2
0  val1  val2
1  val1  val2
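`json.loads` only accepts valid JSON, which requires double-quoted strings. If the file instead holds a Python-style literal with single quotes (an assumption; the original question's string is not shown in full), `ast.literal_eval` from the standard library may be a workable substitute. A minimal sketch with a hypothetical `s_literal`:

```python
import ast

import pandas as pd

# Hypothetical input: a Python literal (single quotes), which json.loads rejects.
s_literal = "[{'key1': 'val1', 'key2': 'val2'}, {'key1': 'val3', 'key2': 'val4'}]"

# literal_eval safely evaluates literals only (no function calls or names).
records = ast.literal_eval(s_literal)
df_literal = pd.DataFrame(records)

print(df_literal)
```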

Search a list of strings in pandas dataframe and add each search string to a new column


Tag : pandas , By : user176445
Date : March 29 2020, 07:55 AM
I think you need str.findall:
With sample data of @AndreyF:
search = ['FR-001', 'FR-002', 'FR-003', 'FR-004']
df['FR'] = df['Description'].str.findall('(' + '|'.join(search) + ')')
print (df)

                            Description                FR
0  AasfasfFR-001,asfasdfafsagsdg FR-002  [FR-001, FR-002]
1                 AasfasfFR-004, FR-002  [FR-004, FR-002]
2         AasfasfFR-02,asfasdfafsagsdg                 []
3  AasfasfFR-001,asfasdfafsagsdg FR-003  [FR-001, FR-003]
4  AasfasfFR-004,asfasdfafsagsdg FR-002  [FR-004, FR-002]
To keep only the rows with at least one match, filter on the truthiness of the list (an empty list is False):
df = df[df['FR'].astype(bool)]
print (df)

                            Description                FR
0  AasfasfFR-001,asfasdfafsagsdg FR-002  [FR-001, FR-002]
1                 AasfasfFR-004, FR-002  [FR-004, FR-002]
3  AasfasfFR-001,asfasdfafsagsdg FR-003  [FR-001, FR-003]
4  AasfasfFR-004,asfasdfafsagsdg FR-002  [FR-004, FR-002]
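The joined pattern is interpreted as a regular expression, so search terms containing characters such as `.` or `+` would match more than intended. One way to guard against that is to escape each term with `re.escape` first. The sketch below uses hypothetical terms and data (`search`, `df_demo`) to show the effect:

```python
import re

import pandas as pd

# Hypothetical terms: the second one contains '.', a regex metacharacter.
search = ['FR-001', 'FR.002']
df_demo = pd.DataFrame(
    {'Description': ['x FR-001 y', 'x FRx002 y', 'x FR.002 y']}
)

# Escape every term so it is matched literally, then join into one alternation.
pattern = '(' + '|'.join(re.escape(term) for term in search) + ')'
df_demo['FR'] = df_demo['Description'].str.findall(pattern)

print(df_demo)
```

Without the escaping, `FR.002` would also match `FRx002` in the second row.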

Pandas - Create df from list of dicts


Tag : python , By : user134570
Date : March 29 2020, 07:55 AM
I have data in the following format (a list of dicts, each containing a list of 3 lists):
data=[{40258: [['2018-07-03T14:13:41'], ['Open'], ['Closed']]},
 {40257: [['2018-07-03T13:47:55',
     '2018-07-03T14:21:52',
     '2018-07-04T11:56:44'],
    ['Open', 'In Progress', 'Waiting on 3rd Party'],
    ['In Progress', 'Waiting on 3rd Party', 'In Progress']]},
  {40255: [['2018-07-03T13:12:58'], ['Open'], ['Closed']]},
  {40250: [[], [], []]}]

import numpy as np
import pandas as pd

# Pad each inner list with NaN up to length 3
f = lambda x: x + [np.nan]*(3-len(x))
mod_data = [ [k]+ sum(list(map(f, v)), []) for d in data for k,v in d.items()]

cols = ['key', 'List1-1', 'List1-2', 'List1-3', 'List2-1', 'List2-2', 'List2-3', 'List3-1', 'List3-2', 'List3-3']
df = pd.DataFrame(mod_data, columns=cols).set_index('key')
print(df)
                   List1-1              List1-2              List1-3 List2-1      List2-2               List2-3      List3-1               List3-2      List3-3
key                                                                                                                                                            
40258  2018-07-03T14:13:41                  NaN                  NaN    Open          NaN                   NaN       Closed                   NaN          NaN
40257  2018-07-03T13:47:55  2018-07-03T14:21:52  2018-07-04T11:56:44    Open  In Progress  Waiting on 3rd Party  In Progress  Waiting on 3rd Party  In Progress
40255  2018-07-03T13:12:58                  NaN                  NaN    Open          NaN                   NaN       Closed                   NaN          NaN
40250                  NaN                  NaN                  NaN     NaN          NaN                   NaN          NaN                   NaN          NaN
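The answer above hard-codes a width of 3 for each inner list. If the longest list is not known in advance, the width and column names could be derived from the data instead. This is a sketch under the assumption that every record still holds exactly three inner lists; the sample `data` and the names `width`, `pad`, and `df_wide` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical, shortened data in the same shape as the original.
data = [{1: [['a'], ['Open'], ['Closed']]},
        {2: [['b', 'c'], ['Open', 'Busy'], ['Busy', 'Closed']]}]

# Longest inner list across all records.
width = max(len(inner) for d in data for lists in d.values() for inner in lists)

def pad(values):
    """Right-pad a list with NaN up to the common width."""
    return values + [np.nan] * (width - len(values))

# One flat row per key: the key followed by the three padded lists.
rows = [[k] + [item for inner in v for item in pad(inner)]
        for d in data for k, v in d.items()]

cols = ['key'] + [f'List{i}-{j}' for i in range(1, 4) for j in range(1, width + 1)]
df_wide = pd.DataFrame(rows, columns=cols).set_index('key')

print(df_wide)
```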

Python / Pandas - put a list of dicts into a Pandas DataFrame - Dict Keys should be the columns


Tag : python , By : tjh0001
Date : March 29 2020, 07:55 AM
The pd.DataFrame constructor accepts a list of dictionaries directly. This will be more efficient than appending repeatedly to an existing dataframe. Here's a demo:
d1 = {'name': 'Demetrius', 'number': '0001',
      'style': 'D', 'text': 'Demetrius an der...',
      'year': '1797'}

d2 = {'name': 'ABC', 'number': '0002',
      'style': 'E', 'text': 'Some text',
      'year': '1850'}

L = [d1, d2]

df = pd.DataFrame(L)

print(df)

        name number style                 text  year
0  Demetrius   0001     D  Demetrius an der...  1797
1        ABC   0002     E            Some text  1850
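When the dictionaries do not all share the same keys, the constructor still builds a column for every key it sees and fills the gaps with NaN. A short sketch with hypothetical records:

```python
import pandas as pd

# Hypothetical records: the first lacks 'style', the second lacks 'year'.
records = [{'name': 'ABC', 'year': '1850'},
           {'name': 'XYZ', 'style': 'E'}]

df_demo = pd.DataFrame(records)

# Missing entries show up as NaN.
print(df_demo)
```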