Using regex to bulk find and replace text with superscript
Thread poster: Rodrigo Rosales Sosa
Hello there, colleagues: I have a large number of segments where I need to replace pairs of digits with those same digits in superscript. In this particular case they are numbers in scientific notation, like 7.00E+02 or 7.00E-02 (meaning 7 × 10² or 7 × 10⁻², with the 2 or −2 in superscript). The job is English into Spanish. Since doing it case by case is impractical, I'm using the following regex for the bulk find and replace: (\d)(\.)(\d{1,2})E(\+|-)(\d{2}), where (\d{2}) needs to be set to a superscript font (I won't be exporting the document, just sending the bilingual file, so I won't be doing it in Word). I can, of course, settle for the power sign (^) and be done with it, but I'd like to know whether this is possible through regex. Kind regards, and I appreciate your help. | | | James Plastow United Kingdom Local time: 03:30 Member (2020) Japanese to English
I once had a job like this and did it by batch replacing the tags in the sdlxliff file directly in Notepad++. If you are careful it is easy enough, but be sure to make a backup, as any mistake will corrupt the file. | | | No tags in this case, I'm afraid | Jul 3, 2022 |
Hello, James: I would try that option, but there are no tags in this case. I'm wondering if I could create superscript tags. Is that possible? | | | Dan Lucas United Kingdom Local time: 03:30 Member (2014) Japanese to English Unicode points? | Jul 3, 2022 |
Rodrigo Rosales Sosa wrote: Since it's impractical doing it case by case, I'm using the following regex for the bulk find and replace: (\d)(\.)(\d{1,2})E(\+|-)(\d{2}), where (\d{2})
Superscript minus, superscript two, and many similar symbols exist as Unicode code points, so you might be able to get this kind of thing: ⁻². I would use this site to look up the code points (just type in "superscript"); then it's just a case of working out how memoQ handles Unicode in its regex engine. If it uses the .NET flavour of regexes, it would probably be something like \u207B for superscript minus. Dan
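The code points Dan mentions can be collected in a small lookup table. What follows is only a minimal Python sketch, run outside memoQ, to show which Unicode characters are involved; `SUPERSCRIPT_MAP` and `to_superscript` are illustrative names and not part of any tool in this thread, and whether memoQ accepts `\u` escapes in its find-and-replace fields is an assumption carried over from the post above.

```python
# Superscript characters written as \u escapes.
# Note that superscript 1, 2 and 3 sit in a different block from the rest.
SUPERSCRIPT_MAP = str.maketrans({
    "0": "\u2070", "1": "\u00b9", "2": "\u00b2", "3": "\u00b3",
    "4": "\u2074", "5": "\u2075", "6": "\u2076", "7": "\u2077",
    "8": "\u2078", "9": "\u2079",
    "+": "\u207a",  # superscript plus
    "-": "\u207b",  # superscript minus
})


def to_superscript(text: str) -> str:
    """Return text with digits and +/- signs swapped for superscript forms."""
    return text.translate(SUPERSCRIPT_MAP)
```

With this table, `to_superscript("-2")` gives the two characters U+207B and U+00B2, which display as ⁻².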
James Plastow United Kingdom Local time: 03:30 Member (2020) Japanese to English
Rodrigo Rosales Sosa wrote: Hello, James: I would try that option, but there are no tags in this case. I'm wondering if I could create superscript tags. Is that possible? Hi Rodrigo, Are you working in Trados? If you are, try opening the xliff in Notepad++ and see what is there. (it helps to install an XML plugin so you can see the text more clearly). There should be tags where there is a superscript. You can batch find and replace these to the other elements you want to make superscript. Dan's solution sounds quicker though.
[Edited at 2022-07-03 19:51 GMT] | | | Stepan Konev Russian Federation Local time: 05:30 English to Russian A series of replacements | Jul 3, 2022 |
You have to run a number of replacements. Begin with E-:
1. Replace E-(\d+) with ×10@@-$1
2. Replace @@-01 with ⁻¹
3. Replace @@-02 with ⁻²
4. Replace @@-03 with ⁻³
5. Replace @@-04 with ⁻⁴
etc.
Then:
1. Replace E\+(\d+) with ×10@@$1
2. Replace @@01 with blank field
3. Replace @@02 with ²
4. Replace @@03 with ³
5. Replace @@04 with ⁴
etc.
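This sequence can be simulated outside the CAT tool to check the result before running it on real segments. The Python sketch below is one way to do that, under a few assumptions: the `@@` placeholder is taken from the post, while the names `convert` and `SUPERSCRIPT` are hypothetical; Python writes the backreference as `\1` where the post uses `$1`; and the table only covers exponents 01 to 09, so larger exponents would need more entries.

```python
import re

# Superscript forms for exponents 1-9 (extend for larger exponents).
SUPERSCRIPT = {1: "¹", 2: "²", 3: "³", 4: "⁴", 5: "⁵",
               6: "⁶", 7: "⁷", 8: "⁸", 9: "⁹"}


def convert(text: str) -> str:
    # Negative exponents first: "7.00E-02" -> "7.00×10@@-02".
    text = re.sub(r"E-(\d+)", r"×10@@-\1", text)
    # Swap each negative placeholder for superscript minus plus digit.
    for n, sup in SUPERSCRIPT.items():
        text = text.replace(f"@@-{n:02d}", "⁻" + sup)
    # Positive exponents: "7.00E+02" -> "7.00×10@@02".
    text = re.sub(r"E\+(\d+)", r"×10@@\1", text)
    # "@@01" becomes empty, so E+01 ends up as a plain "×10".
    text = text.replace("@@01", "")
    for n, sup in SUPERSCRIPT.items():
        if n > 1:
            text = text.replace(f"@@{n:02d}", sup)
    return text
```

Any exponent missing from the table keeps its `@@` marker, which makes leftovers easy to find with a plain search afterwards.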
[Edited at 2022-07-03 21:20 GMT] | | | I'll try it out and report back. | Jul 4, 2022 |
Stepan Konev wrote: You have to run a number of replacements. Begin with E-: 1. Replace E-(\d+) with ×10@@-$1 2. Replace @@-01 with ⁻¹ 3. Replace @@-02 with ⁻² 4. Replace @@-03 with ⁻³ 5. Replace @@-04 with ⁻⁴ etc. Then 1. Replace E\+(\d+) with ×10@@$1 2. Replace @@01 with blank field 3. Replace @@02 with ² 4. Replace @@03 with ³ 5. Replace @@04 with ⁴ etc.
[Edited at 2022-07-03 21:20 GMT] Why didn't I think about copying the numbers already in superscript? I'll report back later. Thank you | | | I'll check it out | Jul 4, 2022 |
Dan Lucas wrote: Rodrigo Rosales Sosa wrote: Since it's impractical doing it case by case, I'm using the following regex for the bulk find and replace: (\d)(\.)(\d{1,2})E(\+|-)(\d{2}), where (\d{2}) Superscript minus, superscript two, and many similar symbols exist as Unicode code points, so you might be able to get this kind of thing: ⁻². I would use this site to look up the code points (just type in "superscript"); then it's just a case of working out how memoQ handles Unicode in its regex engine. If it uses the .NET flavour of regexes, it would probably be something like \u207B for superscript minus. Dan
Thank you, Dan. I'll check this option out and report back later.
Hello there: I managed to solve the issue by finding the Unicode characters for each superscript number and for the minus sign (−), then running a series of replacements starting from 1, and voilà (link to screenshot for future reference: https://imgur.com/a/hVVlbPU). Thank you for your suggestions and help. | | | Sorry, I should've mentioned | Jul 5, 2022 |
James Plastow wrote: Rodrigo Rosales Sosa wrote: Hello, James: I would try that option, but there are no tags in this case. I'm wondering if I could create superscript tags. Is that possible? Hi Rodrigo, Are you working in Trados? If you are, try opening the xliff in Notepad++ and see what is there. (it helps to install an XML plugin so you can see the text more clearly). There should be tags where there is a superscript. You can batch find and replace these to the other elements you want to make superscript. Dan's solution sounds quicker though. [Edited at 2022-07-03 19:51 GMT] I should've mentioned it earlier: I'm working in memoQ. I did try Dan's solution and it worked. Thank you