As always, the context!
Preliminaries
Recently I needed a way to transport data from one environment to another. Though the proper way to do this would probably be to use a database (which I don't know how to work with), I chose to write the data to a simple text file, which is then read and parsed when needed. Typical content of such a file (a real example, hence the alignment):

    Ticket: 2 Type: ORDER_TYPE_SELL_STOP State: ORDER_STATE_CANCELED Reason: ORDER_REASON_EXPERT Start: 1507032000000 End: 1507140000000
    Ticket: 3 Type: ORDER_TYPE_BUY_STOP  State: ORDER_STATE_CANCELED Reason: ORDER_REASON_EXPERT Start: 1507723200000 End: 1507760140000
    Ticket: 4 Type: ORDER_TYPE_SELL_STOP State: ORDER_STATE_FILLED   Reason: ORDER_REASON_EXPERT Start: 1508375100000 End: 1508389780000
    Ticket: 5 Type: ORDER_TYPE_BUY       State: ORDER_STATE_FILLED   Reason: ORDER_REASON_SL     Start: 1508392000000 End: 1508392000000
    Ticket: 6 Type: ORDER_TYPE_SELL_STOP State: ORDER_STATE_CANCELED Reason: ORDER_REASON_EXPERT Start: 1508522400000 End: 1508524540000
    Ticket: 7 Type: ORDER_TYPE_BUY_STOP  State: ORDER_STATE_CANCELED Reason: ORDER_REASON_EXPERT Start: 1509624000000 End: 1509638920000
    Ticket: 8 Type: ORDER_TYPE_BUY_STOP  State: ORDER_STATE_CANCELED Reason: ORDER_REASON_EXPERT Start: 1509671100000 End: 1509732000000
where each line is (must be) of the form :
         Type     :     ORDER_TYPE_SELL_STOP     State : ....
      ^   ^    ^  ^  ^           ^            ^    ^   ^
     WPs KEY  WPs D WPs        VALUE         WPs  KEY  D
where
    WPs        - any number of whitespace characters
    D          - delimiter character, separates KEY and VALUE
    KEY, VALUE - WORDs: they contain neither a whitespace character nor the delimiter

Note that one line may contain several KEY-VALUE pairs.
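For completeness, here is a minimal sketch of how a line in this format could be produced on the writing side. The name `formatLine` and its signature are hypothetical (the code that actually wrote the file above is not shown here), and the sketch assumes every key and value is already a WORD in the sense defined above.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical writer: joins key/value pairs into a single
// "KEY: VALUE KEY: VALUE ..." line.
// Assumption: keys and values contain neither whitespace nor the delimiter.
std::string formatLine(
    const std::vector<std::pair<std::string, std::string>>& fields,
    char delim = ':')
{
    std::ostringstream out;
    bool first = true;
    for (const auto& field : fields) {
        if (!first) out << ' ';          // one space between pairs
        out << field.first << delim << ' ' << field.second;
        first = false;
    }
    return out.str();
}
```

Calling `formatLine({{"Ticket", "2"}, {"Type", "ORDER_TYPE_SELL_STOP"}})` should give `Ticket: 2 Type: ORDER_TYPE_SELL_STOP`. Column alignment like in the real file is left out to keep the sketch short.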
The Algorithm
    Take a string S
    Remove first WP characters if any (until the first non-WP is met) from S
    if MAX_INIT_WORD of S is NOT EMPTY
        KEY := MAX_INIT_WORD
        Remove MAX_INIT_WORD from S
    else
        break
    Remove first WP characters if any (until the first non-WP is met) from S
    if S starts with D
        Remove D from S
    else
        break
    Remove first WP characters if any (until the first non-WP is met) from S
    if MAX_INIT_WORD of S is NOT EMPTY
        VALUE := MAX_INIT_WORD
        Remove MAX_INIT_WORD from S

where MAX_INIT_WORD is a maximal prefix of S that doesn't contain WP or D.
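As an illustration of these steps, here is a minimal sketch that walks the input with an index instead of removing characters from S. The name `parsePairs` is hypothetical and this is not the implementation under review. It assumes the steps repeat until a `break` is hit, and that a KEY with an empty VALUE is still stored before stopping.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: same steps as the pseudocode, but it advances an
// index into the input rather than erasing characters from the string.
std::map<std::string, std::string>
parsePairs(const std::string& s, char delim = ':',
           const std::string& white = " \n\t\v\r\f")
{
    std::map<std::string, std::string> result;
    std::size_t pos = 0;

    // Skip WP characters starting at pos.
    auto skipWhite = [&] {
        while (pos < s.size() && white.find(s[pos]) != std::string::npos)
            ++pos;
    };

    // Return MAX_INIT_WORD at pos (longest run without WP or D), step past it.
    auto takeWord = [&]() -> std::string {
        const std::size_t start = pos;
        while (pos < s.size() && s[pos] != delim &&
               white.find(s[pos]) == std::string::npos)
            ++pos;
        return s.substr(start, pos - start);
    };

    while (true) {
        skipWhite();
        const std::string key = takeWord();
        if (key.empty()) break;                         // no KEY: stop

        skipWhite();
        if (pos == s.size() || s[pos] != delim) break;  // no D: stop
        ++pos;

        skipWhite();
        const std::string value = takeWord();
        result[key] = value;        // assumption: keep KEY even if VALUE is empty
        if (value.empty()) break;   // KEY without VALUE: stop
    }
    return result;
}
```

Because the input is only read, it can be taken by const reference, and no sentinel whitespace character has to be appended.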
The C++ code
    #include <map>
    #include <string>

    using dict_t = std::map<std::string, std::string>;

    dict_t GetDict( std::string s, char delim = ':', const std::string& white = " \n\t\v\r\f" )
    {
        dict_t m;
        if( white.empty() )
            return m;
        s += white[0];// necessary if s doesn't contain trailing spaces
        size_t pos = 0;
        auto removeLeading = [&](){
            if( (pos = s.find_first_not_of( white )) != std::string::npos)
                s.erase( 0, pos );
        };
        auto maxInitWord = [&]() -> std::string {
            std::string word;
            if( (pos = s.find_first_of( white + delim )) != std::string::npos )
            {
                word = s.substr( 0, pos );
                s.erase( 0, pos );
            }
            return word;
        };
        while( true )
        {
            std::string key;
            removeLeading();
            if( (key = maxInitWord()).empty() )
                break;
            removeLeading();
            if( s.empty() or s[0] != delim )
                break;
            s.erase( 0, 1 );
            removeLeading();
            if( (m[key] = maxInitWord()).empty() )
                break;
        }
        return m;
    }
Example (C++ interpreter used)
    root [1] GetDict( "....Am-I...:...Inside...A:...Simulation?", ':', "." )
    (dict_t) { "A" => "Simulation?", "Am-I" => "Inside" }
Evidently (thanks to @JDługosz), I should show the results on some "illegal" inputs. Though there are infinitely many of them, they can be divided into two groups: those which produce an empty map, and those which produce a map with an empty value:
    root [6] GetDict( "....:....an empty map", ':', "." )
    (dict_t) { }
and
    root [7] GetDict( "an empty value....:....", ':', "." )
    (dict_t) { "an empty value" => "" }

Of course, an input can also be only partially correct, in which case the pairs parsed before the error are kept:

    root [8] GetDict( "partially....:....correct...content...:....", ':', "." )
    (dict_t) { "content" => "", "partially" => "correct" }
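Given these two groups of results, a caller could run a simple sanity check on the returned map. The sketch below is only an illustration: `looksComplete` is a hypothetical name, and it assumes that an empty map or an empty value should be treated as a sign of illegal input.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>

// Hypothetical check on a parsed dictionary: returns true only if the map
// is non-empty and none of its values is empty.
bool looksComplete(const std::map<std::string, std::string>& dict)
{
    if (dict.empty())
        return false;  // first group: nothing could be parsed
    return std::none_of(dict.begin(), dict.end(),
        [](const std::pair<const std::string, std::string>& entry) {
            return entry.second.empty();  // second group: KEY without VALUE
        });
}
```

This cannot catch every illegal input. With the default delimiter and whitespace, a line such as `a:b junk` still yields the single complete pair `a => b`, because parsing stops silently at the first token that is not followed by the delimiter.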