Using Raku (formerly known as Perl_6)
raku -pe 's:g/ \<DATA ( <+[0..9]+[-T:.]>+ ) \s\/\> /{"<DATA>"~$0~"</DATA>"}/;'
OR
raku -pe 's:g[ "<DATA" ( <+[0..9]+[-T:.]>+ ) " />" ] = ["<DATA>"~$0~"</DATA>"];'
Sample Input:
<DATA2020-04-13T08:59:05.427 />
Sample Output:
<DATA>2020-04-13T08:59:05.427</DATA>
Above are answers coded in Raku, a member of the Perl family of programming languages. Both examples above have four general characteristics of note:
No guessing about which characters to backslash: anything that isn't <alnum> (alphanumeric or underscore) must be escaped or quoted to match literally,
Regex modifiers such as /g (global) now go at the head of the s/// form, immediately after the s and preceded by a colon; either s:global or s:g works,
Perl's /x modifier is now the default in Raku (allows liberal whitespace between regex atoms), and
String concatenation in Raku is accomplished with the ~ tilde.
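The four points above can be seen together in a short script version of the first one-liner. This is only a minimal sketch: the variable name $line is illustrative, and its value is the Sample Input shown earlier.

```raku
# Illustrative variable holding the Sample Input from above.
my $line = '<DATA2020-04-13T08:59:05.427 />';

# 1. The characters < / > are not <alnum>, so each one is backslash-escaped.
# 2. The :g adverb sits right after the s, preceded by a colon.
# 3. Whitespace between regex atoms is ignored by default.
# 4. The replacement closure joins its pieces with ~.
$line ~~ s:g/ \<DATA ( <+[0..9]+[-T:.]>+ ) \s \/ \> /{"<DATA>" ~ $0 ~ "</DATA>"}/;

say $line;   # <DATA>2020-04-13T08:59:05.427</DATA>
```

Outside of -pe the substitution has no implicit topic line, so here it is applied to $line with the ~~ smartmatch operator.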
Both examples above use an enumerated character class <+[0..9]+[-T:.]>, which very simply consists of the digits [0..9], plus the four characters [-T:.]. Also, while the first example above follows the traditional s/// substitution idiom, the second example above uses Raku's new 'sans-backslash' replacement format (with an = equals sign in-between), which some readers may find to be more readable.
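To check what the character class accepts on its own, it can be anchored against a whole string. The snippet below is a small sketch; the second test string (with a space in place of the T) is a made-up counterexample, not something from the original input.

```raku
# The class is copied verbatim from the examples above; ^ and $ anchor it
# so that every character of the string must belong to the class.
say so '2020-04-13T08:59:05.427' ~~ / ^ <+[0..9]+[-T:.]>+ $ /;   # True

# Made-up counterexample: the space is not in the class, so the match fails.
say so '2020-04-13 08:59:05.427' ~~ / ^ <+[0..9]+[-T:.]>+ $ /;   # False
```

Because the class lists the hyphen first inside its brackets, it is taken literally; ranges in Raku character classes are written with .. rather than a hyphen.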
Finally, if you have any interest in DateTime extraction/modification, Raku has you covered:
~$ echo '<DATA2020-04-13T08:59:05.427 />' | raku -pe 's:g[ "<DATA" ( <+[0..9]+[-T:.]>+ ) " />" ] = [DateTime($0~"Z")];'
2020-04-13T08:59:05.427000Z
~$ echo '<DATA2020-04-13T08:59:05.427 />' | raku -pe 's:g[ "<DATA" ( <+[0..9]+[-T:.]>+ ) " />" ] = [DateTime(now) - DateTime($0~"Z")];'
54862286.622457
https://docs.raku.org/language/regexes#Enumerated_character_classes_and_ranges
https://docs.raku.org/routine/DateTime
https://raku.org