How to extract dates and all of the data following them using re.findall in Python
Tag : python , By : francisco santos
Date : March 29 2020, 07:55 AM
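I have a string, and I'm hoping to use regular expressions to separate some of the information from it.
Try using the regex: (\d{8}\s)10\.3\s-84\.8\$(.*?)(?=\d{8}|$)
The first group captures an 8-digit date plus the whitespace after it, 10\.3\s-84\.8\$ matches the fixed text "10.3 -84.8$", and the lazy group (.*?) captures everything up to the next 8-digit date or the end of the string.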
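To see how that regex behaves with re.findall, here is a minimal sketch. The sample string is an assumption (the original input was not shown); only the pattern comes from the answer above.

```python
import re

# Hypothetical sample input; the asker's real string was not shown.
text = "20200101 10.3 -84.8$first record 20200102 10.3 -84.8$second record"

# Pattern from the answer: a date, the fixed "10.3 -84.8$" text, then a lazy
# capture that stops at the next 8-digit date or at the end of the string.
pattern = r"(\d{8}\s)10\.3\s-84\.8\$(.*?)(?=\d{8}|$)"

# With two capture groups, re.findall returns a list of (date, data) tuples.
records = [(date.strip(), data.strip()) for date, data in re.findall(pattern, text)]

for date, data in records:
    print(date, "->", data)
```

Because the lookahead stops at any run of 8 digits, data that itself contains an 8-digit number would be cut short; a stricter date pattern may be needed for real input.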
|
Tag : python , By : CSCI GOIN KILL ME
Date : March 29 2020, 07:55 AM
You shouldn't call your string "str", because that shadows the built-in type. But here's an option for you:
import re
# Find all of the entries (s is the input string)
x = re.findall(r'(?<![AB]:)(?<=:).*?(?=[,}])', s)
['"mb"', '9', '"John"', '"/mb9/"', '0', '83498', '"mb"', '92', '"Mary"',
'"/mb92/"', '0', '404', '"mb"', '97', '"Dan"', '"/mb97/"', '0', '139',
'"mb"', '268', '"Jennifer"', '"/mb268/"', '0', '0', '"mb"', '289', '"Mike"',
'"/mb289/"', '0', '0', '"mb"', '157', '"Sue"', '"/mb157/"', '0', '35200',
'"mb"', '3', '"Rob"', '"/mb3/"', '0', '103047', '"mb"', '2', '"Tracy"',
'"/mb2/"', '0', '87946', '"mb"', '26', '"Jenny"', '"/mb26/"', '0', '74870',
'"mb"', '5', '"Florence"', '"/mb5/"', '0', '37261', '"mb"', '127', '"Peter"',
'"/mb127/"', '0', '63711', '"mb"', '15', '"Grace"', '"/mb15/"', '0', '63243',
'"mb"', '82', '"Tony"', '"/mb82/"', '0', '6471', '"mb"', '236', '"Lisa"',
'"/mb236/"', '0', '4883']
# Break up into each section (6 values per record)
y = []
for i in range(0, len(x), 6):
    y.append(x[i:i+6])
[['"mb"', '9', '"John"', '"/mb9/"', '0', '83498']
['"mb"', '92', '"Mary"', '"/mb92/"', '0', '404']
['"mb"', '97', '"Dan"', '"/mb97/"', '0', '139']
['"mb"', '268', '"Jennifer"', '"/mb268/"', '0', '0']
['"mb"', '289', '"Mike"', '"/mb289/"', '0', '0']
['"mb"', '157', '"Sue"', '"/mb157/"', '0', '35200']
['"mb"', '3', '"Rob"', '"/mb3/"', '0', '103047']
['"mb"', '2', '"Tracy"', '"/mb2/"', '0', '87946']
['"mb"', '26', '"Jenny"', '"/mb26/"', '0', '74870']
['"mb"', '5', '"Florence"', '"/mb5/"', '0', '37261']
['"mb"', '127', '"Peter"', '"/mb127/"', '0', '63711']
['"mb"', '15', '"Grace"', '"/mb15/"', '0', '63243']
['"mb"', '82', '"Tony"', '"/mb82/"', '0', '6471']
['"mb"', '236', '"Lisa"', '"/mb236/"', '0', '4883']]
# Name is 3rd value in each list and url is 4th
for i in y:
    name = i[2]
    url = i[3]
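The steps above can be sketched end to end as follows. The input string here is an assumption: the original s was not shown, so the key names ("type", "id", "flag", "score", and so on) are invented for illustration. The sketch also strips the surrounding double quotes from the captured name and URL, which the original loop leaves in place.

```python
import re

# Hypothetical input; the real string `s` and its key names were not shown.
s = ('{"type":"mb","id":9,"name":"John","url":"/mb9/","flag":0,"score":83498},'
     '{"type":"mb","id":92,"name":"Mary","url":"/mb92/","flag":0,"score":404}')

# Same pattern as above: capture whatever follows a colon, up to "," or "}".
values = re.findall(r'(?<![AB]:)(?<=:).*?(?=[,}])', s)

# Group the flat list into records of 6 values each.
records = [values[i:i + 6] for i in range(0, len(values), 6)]

# Name is the 3rd value and url the 4th; drop the quotes around them.
people = [(rec[2].strip('"'), rec[3].strip('"')) for rec in records]

for name, url in people:
    print(name, url)
```

If the input is actually valid JSON, parsing it with the json module would likely be more robust than a regex, but that depends on the real data.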
|
Tag : python , By : terrestrial
Date : March 29 2020, 07:55 AM
The trouble is that .+ is greedy, so it slurps up the first comma and only stops at the last one. Change it to the lazy form .+? or, better yet, to [^,]+, which cannot match a comma at all.
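The difference between the three forms is easy to check on a small example. The sample text below is hypothetical, since the original pattern and input were not shown.

```python
import re

# Hypothetical comma-separated input.
text = "alpha, beta, gamma"

# Greedy: .+ grabs as much as possible, so it runs up to the LAST comma.
greedy = re.match(r"(.+),", text).group(1)

# Lazy: .+? grabs as little as possible, so it stops before the FIRST comma.
lazy = re.match(r"(.+?),", text).group(1)

# Negated class: [^,]+ can never cross a comma, so no backtracking is needed.
negated = re.match(r"([^,]+),", text).group(1)

print(repr(greedy))   # 'alpha, beta'
print(repr(lazy))     # 'alpha'
print(repr(negated))  # 'alpha'
```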
|
Date : March 29 2020, 07:55 AM
I have had a longer break from Python and now I need your help again :)
import re
a = ['>lcl|NC_003078.1_gene_1 [gene=lacE] [locus_tag=SM_b21652] [location=1..1275]\n','>lcl|NC_003078.1_gene_2 [gene=lacF] [locus_tag=SM_b21653] [location=complement(22345..23337)]\n']
for i in a:
    val = re.findall(r"location=.*?\]", i)[0]  # Find the location field.
    val = re.findall(r"\d+", val)  # Find start and end.
    print("Start: {0} End: {1}".format(val[0], val[1]))
Start: 1 End: 1275
Start: 22345 End: 23337
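As an alternative sketch, the two findall calls can be folded into one pattern with two capture groups. The names LOCATION and parse_location are hypothetical helpers introduced here, and the pattern assumes only the two location forms shown above (plain start..end and complement(start..end)).

```python
import re

headers = [
    '>lcl|NC_003078.1_gene_1 [gene=lacE] [locus_tag=SM_b21652] [location=1..1275]\n',
    '>lcl|NC_003078.1_gene_2 [gene=lacF] [locus_tag=SM_b21653] [location=complement(22345..23337)]\n',
]

# Optional "complement(" prefix, then start, a literal "..", then end.
LOCATION = re.compile(r"location=(?:complement\()?(\d+)\.\.(\d+)")

def parse_location(header):
    """Return (start, end) as ints, or None if no location field is found."""
    m = LOCATION.search(header)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

coords = [parse_location(h) for h in headers]

for start, end in coords:
    print("Start: {0} End: {1}".format(start, end))
```

Other location forms, such as joined ranges or partial-boundary markers, are not handled by this pattern and would return only the first range or None.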
|