How can I deal with accented letters, German letters and other characters?
Tag : python , By : tanminivan
Date : March 29 2020, 07:55 AM
Don't parse http://translate.google.com/translate_t; Google provides an AJAX service for this purpose. The translatedText field in the JSON data returned by ajax.googleapis.com is already a unicode string.
import urllib2
import urllib
import sys
import json
LANG={
    "arabic":"ar", "bulgarian":"bg", "chinese":"zh-CN",
    "croatian":"hr", "czech":"cs", "danish":"da", "dutch":"nl",
    "english":"en", "finnish":"fi", "french":"fr", "german":"de",
    "greek":"el", "hindi":"hi", "italian":"it", "japanese":"ja",
    "korean":"ko", "norwegian":"no", "polish":"pl", "portuguese":"pt",
    "romanian":"ro", "russian":"ru", "spanish":"es", "swedish":"sv" }
def translate(text,lang1,lang2):
    # Accept either a language name listed in LANG or a raw language code.
    base_url='http://ajax.googleapis.com/ajax/services/language/translate?'
    langpair='%s|%s'%(LANG.get(lang1.lower(),lang1),
                      LANG.get(lang2.lower(),lang2))
    # Encode the query text as UTF-8 before URL-encoding it.
    params=urllib.urlencode( (('v',1.0),
                              ('q',text.encode('utf-8')),
                              ('langpair',langpair),) )
    url=base_url+params
    content=urllib2.urlopen(url).read()
    # Fall back through the function names used by older json modules.
    try: trans_dict=json.loads(content)
    except AttributeError:
        try: trans_dict=json.load(content)
        except AttributeError: trans_dict=json.read(content)
    return trans_dict['responseData']['translatedText']
print translate("Good morning to you friend!", "English", "German")
print translate("Good morning to you friend!", "English", "Italian")
print translate("Good morning to you friend!", "English", "Spanish")
Guten Morgen, du Freund!
Buongiorno a te amico!
Buenos días a ti amigo!
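The snippet above is Python 2 and needs network access. As a rough Python 3 sketch of the two encoding steps it relies on (UTF-8 percent-encoding of the query, and JSON decoding to a Unicode string), the following runs offline. The build_query helper and the sample response body are assumptions made for illustration; the sample only mimics the responseData/translatedText shape that the code above reads.

```python
import json
from urllib.parse import parse_qs, urlencode


def build_query(text, langpair):
    # In Python 3, urlencode percent-encodes str values as UTF-8 by default,
    # so no explicit text.encode('utf-8') is needed.
    return urlencode({"v": "1.0", "q": text, "langpair": langpair})


query = build_query("Schönen Tag, Freund!", "de|en")
print(query)

# Hypothetical response body (an assumption, not captured from the service).
sample = '{"responseData": {"translatedText": "Buenos d\\u00edas a ti amigo!"}}'

# json.loads turns the \u00ed escape into a real character in a str.
translated = json.loads(sample)["responseData"]["translatedText"]
print(translated)
```

The accented character in the query travels as %C3%B6, and the decoded translatedText is an ordinary str that can be printed without further conversion.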
|
Manipulating a String: Removing special characters - Change all accented letters to non accented
Tag : chash , By : user87225
Date : March 29 2020, 07:55 AM
Similar to mathieu's answer, but tailored to your requirements. This solution first strips special characters and diacritics from the input string, then replaces whitespace with dashes:
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;
string s = "#Hi this is rèally/ special strìng!!!";
// FormD splits each accented letter into a base letter plus combining marks.
string normalized = s.Normalize(NormalizationForm.FormD);
StringBuilder resultBuilder = new StringBuilder();
foreach (var character in normalized)
{
    // Keep only letters and spaces; combining marks and punctuation are dropped.
    UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(character);
    if (category == UnicodeCategory.LowercaseLetter
        || category == UnicodeCategory.UppercaseLetter
        || category == UnicodeCategory.SpaceSeparator)
        resultBuilder.Append(character);
}
string result = Regex.Replace(resultBuilder.ToString(), @"\s+", "-");
// result: "Hi-this-is-really-special-string"
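For comparison, the normalize-then-filter approach of the C# snippet above can be sketched in Python with the standard unicodedata module. The function name to_dashed is made up for this example; the category codes "Ll", "Lu" and "Zs" are the Unicode general categories for lowercase letters, uppercase letters and space separators.

```python
import re
import unicodedata


def to_dashed(text):
    # NFD decomposition splits "è" into "e" plus a combining grave accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only lowercase letters, uppercase letters and space separators;
    # combining marks, punctuation and digits are dropped.
    kept = [ch for ch in decomposed
            if unicodedata.category(ch) in ("Ll", "Lu", "Zs")]
    # Collapse each run of whitespace into a single dash.
    return re.sub(r"\s+", "-", "".join(kept))


print(to_dashed("#Hi this is rèally/ special strìng!!!"))
```

As in the C# version, digits are removed along with punctuation, so add the decimal-digit category if numbers should survive.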
|
PHP-REGEX: accented letters matches non-accented ones, and vice versa. How to achieve this?
Tag : php , By : Roel van Dijk
Date : March 29 2020, 07:55 AM
You can write a function that builds the regular expression from your txt_search, replacing each vowel with a character class that lists all of its accented variants:
function search_term($txt_search) {
    // Escape regex metacharacters in the user's input first.
    $search = preg_quote($txt_search);
    // Replace each vowel, accented or not, with a class of all its variants.
    $search = preg_replace('/[aàáâãåäæ]/iu', '[aàáâãåäæ]', $search);
    $search = preg_replace('/[eèéêë]/iu', '[eèéêë]', $search);
    $search = preg_replace('/[iìíîï]/iu', '[iìíîï]', $search);
    $search = preg_replace('/[oòóôõöø]/iu', '[oòóôõöø]', $search);
    $search = preg_replace('/[uùúûü]/iu', '[uùúûü]', $search);
    // add any other character
    return $search;
}
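The same character-class idea can be sketched in Python to show how the generated pattern behaves. The names ACCENT_GROUPS and search_pattern are invented for this example, and the groups simply mirror the ones in the PHP function; this is one possible arrangement, not the only way to do it.

```python
import re

# Same vowel groups as the PHP function above.
ACCENT_GROUPS = ["aàáâãåäæ", "eèéêë", "iìíîï", "oòóôõöø", "uùúûü"]


def search_pattern(term):
    parts = []
    for ch in term:
        for group in ACCENT_GROUPS:
            if ch.lower() in group:
                # Any member of the group becomes a class matching all of them.
                parts.append("[" + group + "]")
                break
        else:
            # Everything else is matched literally.
            parts.append(re.escape(ch))
    # IGNORECASE lets the lowercase classes match uppercase letters too.
    return re.compile("".join(parts), re.IGNORECASE)


print(bool(search_pattern("cafe").search("Un café noir")))
print(bool(search_pattern("café").search("CAFE")))
```

Both directions match: an unaccented search term finds accented text, and an accented term finds unaccented text.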
|
HTML textarea: problems with accented letters
Date : March 29 2020, 07:55 AM
When I type words with accented letters into a textarea, the application stores them in MySQL incorrectly. For example, if I write può, the value stored in MySQL comes out garbled. To change an existing table to use the UTF-8 charset:
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
To change the default charset of the database as well, so that new tables pick it up:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
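A short Python sketch of how this kind of corruption usually arises. It assumes the garbling comes from UTF-8 bytes being interpreted as Latin-1 somewhere between the form and the table, which is the common cause but is not confirmed by the question.

```python
original = "può"

# "ò" is two bytes in UTF-8 (0xC3 0xB2).
utf8_bytes = original.encode("utf-8")

# Reading those bytes as Latin-1 turns each byte into its own character.
garbled = utf8_bytes.decode("latin-1")
print(garbled)

# Reversing the wrong step recovers the text, as long as no bytes were lost.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)
```

If the stored values look like the first printed line, the charset mismatch described above is the likely culprit, and the connection charset is worth checking alongside the table charset.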
|
Replace accented letters with the respective non-accented ones at Python 3
Date : December 17 2020, 07:32 AM
The linked answer references the third-party module unidecode, not Python 2's unicode type:
$ python3
Python 3.7.1 (default, Nov 19 2018, 13:04:22)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import unidecode
>>> unidecode.unidecode('intérêt')
'interet'
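If installing a third-party package is not an option, a standard-library approximation is possible with unicodedata. This is a sketch with a made-up function name, strip_accents, and it is not equivalent to unidecode: it only removes combining marks, so characters without a decomposition (such as ß or ø) pass through unchanged rather than being transliterated.

```python
import unicodedata


def strip_accents(text):
    # Decompose accented letters into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns 0 for characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("intérêt"))
```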
|