216

The Windows FINDSTR command is horribly documented. There is very basic command line help available through FINDSTR /?, or HELP FINDSTR, but it is woefully inadequate. There is a wee bit more documentation online at https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/findstr.

There are many FINDSTR features and limitations that are not even hinted at in the documentation. Nor could they be anticipated without prior knowledge and/or careful experimentation.

So the question is - What are the undocumented FINDSTR features and limitations?

The purpose of this question is to provide a one stop repository of the many undocumented features so that:

A) Developers can take full advantage of the features that are there.

B) Developers don't waste their time wondering why something doesn't work when it seems like it should.

Please make sure you know the existing documentation before responding. If the information is covered by the HELP, then it does not belong here.

Neither is this a place to show interesting uses of FINDSTR. If a logical person could anticipate the behavior of a particular usage of FINDSTR based on the documentation, then it does not belong here.

Along the same lines, if a logical person could anticipate the behavior of a particular usage based on information contained in any existing answers, then again, it does not belong here.

19
  • 18
    Or, alternatively, you could ditch the crappy undocumented MS utility altogether and install/use grep which is very well understood and documented :-) See stackoverflow.com/questions/2635740/… for example.
    – paxdiablo
    CommentedJan 13, 2012 at 1:41
  • 25
    By all means, if you are in a position to use something other than FINDSTR, then that is highly advised. But some people are in environments where 3rd party utilities are forbidden.
    – dbenham
    CommentedJan 13, 2012 at 1:44
  • 5
    No offense taken. I seriously considered putting in my own FINDSTR disclaimer that was similar to your comment! :)
    – dbenham
    CommentedJan 13, 2012 at 2:00
  • 53
    I am shocked and dissapointed someone would find this question "Not Constructive" and vote to close. A lot of thought went into the question specifically to avoid "opinion, debate, arguments, polling, or extended discussion". The question has been posted for 3.5 months, and none of the negatives cited have occurred. The paired answer is filled with facts, and required many hours of painstaking research and experimentation.
    – dbenham
    CommentedApr 30, 2012 at 22:30
  • 7
    Some readers may be interested in the historical context of the findstr command: blogs.msdn.com/b/oldnewthing/archive/2012/11/28/10372436.aspxCommentedNov 28, 2012 at 21:49

8 Answers 8

317

Preface
Much of the information in this answer has been gathered based on experiments run on a Vista machine. Unless explicitly stated otherwise, I have not confirmed whether the information applies to other Windows versions.

FINDSTR output
The documentation never bothers to explain the output of FINDSTR. It alludes to the fact that matching lines are printed, but nothing more.

The format of matching line output is as follows:

filename:lineNumber:lineOffset:text

where

fileName: = The name of the file containing the matching line. The file name is not printed if the request was explicitly for a single file, or if searching piped input or redirected input. When printed, the fileName will always include any path information provided. Additional path information will be added if the /S option is used. The printed path is always relative to the provided path, or relative to the current directory if none provided.

Note - The filename prefix can be avoided when searching multiple files by using the non-standard (and poorly documented) wildcards< and >. The exact rules for how these wildcards work can be found here. Finally, you can look at this example of how the non-standard wildcards work with FINDSTR.

lineNumber: = The line number of the matching line represented as a decimal value with 1 representing the 1st line of the input. Only printed if /N option is specified.

lineOffset: = The decimal byte offset of the start of the matching line, with 0 representing the 1st character of the 1st line. Only printed if /O option is specified. This is not the offset of the match within the line. It is the number of bytes from the beginning of the file to the beginning of the line.

text = The binary representation of the matching line, including any <CR> and/or <LF>. Nothing is left out of the binary output, such that this example that matches all lines will produce an exact binary copy of the original file.

FINDSTR "^" FILE >FILE_COPY 

The /A option sets the color of the fileName:, lineNumber:, and lineOffset: output only. The text of the matching line is always output with the current console color. The /A option only has effect when output is displayed directly to the console. The /A option has no effect if the output is redirected to a file or piped. See the 2018-08-18 edit in Aacini's answer for a description of the buggy behavior when output is redirected to CON.

Most control characters and many extended ASCII characters display as dots on XP
FINDSTR on XP displays most non-printable control characters from matching lines as dots (periods) on the screen. The following control characters are exceptions; they display as themselves: 0x09 Tab, 0x0A LineFeed, 0x0B Vertical Tab, 0x0C Form Feed, 0x0D Carriage Return.

XP FINDSTR also converts a number of extended ASCII characters to dots as well. The extended ASCII characters that display as dots on XP are the same as those that are transformed when supplied on the command line. See the "Character limits for command line parameters - Extended ASCII transformation" section, later in this post

Control characters and extended ASCII are not converted to dots on XP if the output is piped, redirected to a file, or within a FOR IN() clause.

Vista and Windows 7 always display all characters as themselves, never as dots.

Return Codes (ERRORLEVEL)

  • 0 (success)
    • Match was found in at least one line of at least one file.
  • 1 (failure)
    • No match was found in any line of any file.
    • Invalid color specified by /A:xx option
  • 2 (error)
    • Incompatible options /L and /R both specified
    • Missing argument after /A:, /F:, /C:, /D:, or /G:
    • File specified by /F:file or /G:file not found
  • 255 (error)

Source of data to search(Updated based on tests with Windows 7)
Findstr can search data from only one of the following sources:

  • filenames specified as arguments and/or using the /F:file option.

  • stdin via redirection findstr "searchString" <file

  • data stream from a pipe type file | findstr "searchString"

Arguments/options take precedence over redirection, which takes precedence over piped data.

File name arguments and /F:file may be combined. Multiple file name arguments may be used. If multiple /F:file options are specified, then only the last one is used. Wild cards are allowed in filename arguments, but not within the file pointed to by /F:file.

Source of search strings(Updated based on tests with Windows 7)
The /G:file and /C:string options may be combined. Multiple /C:string options may be specified. If multiple /G:file options are specified, then only the last one is used. If either /G:file or /C:string is used, then all non-option arguments are assumed to be files to search. If neither /G:file nor /C:string is used, then the first non-option argument is treated as a space delimited list of search terms.

File names must not be quoted within the file when using the /F:FILE option.
File names may contain spaces and other special characters. Most commands require that such file names are quoted. But the FINDSTR /F:files.txt option requires that filenames within files.txt must NOT be quoted. The file will not be found if the name is quoted.

BUG - Short 8.3 filenames can break the /D and /S options
As with all Windows commands, FINDSTR will attempt to match both the long name and the short 8.3 name when looking for files to search. Assume the current folder contains the following non-empty files:

b1.txt b.txt2 c.txt 

The following command will successfully find all 3 files:

findstr /m "^" *.txt 

b.txt2 matches because the corresponding short name B9F64~1.TXT matches. This is consistent with the behavior of all other Windows commands.

But a bug with the /D and /S options causes the following commands to only find b1.txt

findstr /m /d:. "^" *.txt findstr /m /s "^" *.txt 

The bug prevents b.txt2 from being found, as well as all file names that sort after b.txt2 within the same directory. Additional files that sort before, like a.txt, are found. Additional files that sort later, like d.txt, are missed once the bug has been triggered.

Each directory searched is treated independently. For example, the /S option would successfully begin searching in a child folder after failing to find files in the parent, but once the bug causes a short file name to be missed in the child, then all subsequent files in that child folder would also be missed.

The commands work bug free if the same file names are created on a machine that has NTFS 8.3 name generation disabled. Of course b.txt2 would not be found, but c.txt would be found properly.

Not all short names trigger the bug. All instances of bugged behavior I have seen involve an extension that is longer than 3 characters with a short 8.3 name that begins the same as a normal name that does not require an 8.3 name.

The bug has been confirmed on XP, Vista, and Windows 7.

Non-Printable characters and the /P option
The /P option causes FINDSTR to skip any file that contains any of the following decimal byte codes:
0-7, 14-25, 27-31.

Put another way, the /P option will only skip files that contain non-printable control characters. Control characters are codes less than or equal to 31 (0x1F). FINDSTR treats the following control characters as printable:

 8 0x08 backspace 9 0x09 horizontal tab 10 0x0A line feed 11 0x0B vertical tab 12 0x0C form feed 13 0x0D carriage return 26 0x1A substitute (end of text) 

All other control characters are treated as non-printable, the presence of which causes the /P option to skip the file.

Piped and Redirected input may have <CR><LF> appended
If the input is piped in and the last character of the stream is not <LF>, then FINDSTR will automatically append <CR><LF> to the input. This has been confirmed on XP, Vista and Windows 7. (I used to think that the Windows pipe was responsible for modifying the input, but I have since discovered that FINDSTR is actually doing the modification.)

The same is true for redirected input on Vista. If the last character of a file used as redirected input is not <LF>, then FINDSTR will automatically append <CR><LF> to the input. However, XP and Windows 7 do not alter redirected input.

FINDSTR hangs on XP and Windows 7 if redirected input does not end with <LF>
This is a nasty "feature" on XP and Windows 7. If the last character of a file used as redirected input does not end with <LF>, then FINDSTR will hang indefinitely once it reaches the end of the redirected file.

Last line of Piped data may be ignored if it consists of a single character
If the input is piped in and the last line consists of a single character that is not followed by <LF>, then FINDSTR completely ignores the last line.

Example - The first command with a single character and no <LF> fails to match, but the second command with 2 characters works fine, as does the third command that has one character with terminating newline.

> set /p "=x" <nul | findstr "^" > set /p "=xx" <nul | findstr "^" xx > echo x| findstr "^" x 

Reported by DosTips user Sponge Belly at new findstr bug. Confirmed on XP, Windows 7 and Windows 8. Haven't heard about Vista yet. (I no longer have Vista to test).

Option syntax
Option letters are not case sensitive, so /i and /I are equivalent.

Options can be prefixed with either / or - Options may be concatenated after a single / or -. However, the concatenated option list may contain at most one multicharacter option such as OFF or F:, and the multi-character option must be the last option in the list.

The following are all equivalent ways of expressing a case insensitive regex search for any line that contains both "hello" and "goodbye" in any order

  • /i /r /c:"hello.*goodbye" /c:"goodbye.*hello"

  • -i -r -c:"hello.*goodbye" /c:"goodbye.*hello"

  • /irc:"hello.*goodbye" /c:"goodbye.*hello"

Options may also be quoted. So /i, -i, "/i" and "-i" are all equivalent. Likewise, /c:string, "/c":string, "/c:"string and "/c:string" are all equivalent.

If a search string begins with a / or - literal, then the /C or /G option must be used. Thanks to Stephan for reporting this in a comment (since deleted).

If the /c:string or /g:file option is used, then the command will fail if the file name argument begins with -, even if quoted. This is because there is no search string argument, so the file name argument is then treated as an option. The easiest workaround is to prefix the file argument with dot backslash, as in

findstr /c:"searchString" ".\-fileName.txt" 

Search String length limits
On Vista the maximum allowed length for a single search string is 511 bytes. If any search string exceeds 511 then the result is a FINDSTR: Search string too long. error with ERRORLEVEL 2.

When doing a regular expression search, the maximum search string length is 254. A regular expression with length between 255 and 511 will result in a FINDSTR: Out of memory error with ERRORLEVEL 2. A regular expression length >511 results in the FINDSTR: Search string too long. error.

On Windows XP the search string length is apparently shorter. Findstr error: "Search string too long": How to extract and match substring in "for" loop? The XP limit is 127 bytes for both literal and regex searches.

Line Length limits
Files specified as a command line argument or via the /F:FILE option have no known line length limit. Searches were successfully run against a 128MB file that did not contain a single <LF>.

Piped data and Redirected input is limited to 8191 bytes per line. This limit is a "feature" of FINDSTR. It is not inherent to pipes or redirection. FINDSTR using redirected stdin or piped input will never match any line that is >=8k bytes. Lines >= 8k generate an error message to stderr, but ERRORLEVEL is still 0 if the search string is found in at least one line of at least one file.

Default type of search: Literal vs Regular Expression
/C:"string" - The default is /L literal. Explicitly combining the /L option with /C:"string" certainly works but is redundant.

"string argument" - The default depends on the content of the very first search string. (Remember that <space> is used to delimit search strings.) If the first search string is a valid regular expression that contains at least one un-escaped meta-character, then all search strings are treated as regular expressions. Otherwise all search strings are treated as literals. For example, "51.4 200" will be treated as two regular expressions because the first string contains an un-escaped dot, whereas "200 51.4" will be treated as two literals because the first string does not contain any meta-characters.

/G:file - The default depends on the content of the first non-empty line in the file. If the first search string is a valid regular expression that contains at least one un-escaped meta-character, then all search strings are treated as regular expressions. Otherwise all search strings are treated as literals.

Recommendation - Always explicitly specify /L literal option or /R regular expression option when using "string argument" or /G:file.

BUG - Specifying multiple literal search strings can give unreliable results

The following simple FINDSTR example fails to find a match, even though it should.

echo ffffaaa|findstr /l "ffffaaa faffaffddd" 

This bug has been confirmed on Windows Server 2003, Windows XP, Vista, and Windows 7.

Based on experiments, FINDSTR may fail if all of the following conditions are met:

  • The search is using multiple literal search strings
  • The search strings are of different lengths
  • A short search string has some amount of overlap with a longer search string
  • The search is case sensitive (no /I option)

In every failure I have seen, it is always one of the shorter search strings that fails.

For more info see Why doesn't this FINDSTR example with multiple literal search strings find a match?

Quotes and backslahses within command line arguments
Note -User MC ND's comments reflect the actual horrifically complicated rules for this section. There are 3 distinct parsing phases involved:

  • First cmd.exe may require some quotes to be escaped as ^" (really nothing to do with FINDSTR)
  • Next FINDSTR uses the pre 2008 MS C/C++ argument parser, which has special rules for " and \
  • After the argument parser finishes, FINDSTR additionally treats \ followed by an alpha-numeric character as literal, but \ followed by non-alpha-numeric character as an escape character

The remainder of this highlighted section is not 100% correct. It can serve as a guide for many situations, but the above rules are required for total understanding.

Escaping Quote within command line search strings
Quotes within command line search strings must be escaped with backslash like \". This is true for both literal and regex search strings. This information has been confirmed on XP, Vista, and Windows 7.

Note: The quote may also need to be escaped for the CMD.EXE parser, but this has nothing to do with FINDSTR. For example, to search for a single quote you could use:

FINDSTR \^" file && echo found || echo not found

Escaping Backslash within command line literal search strings
Backslash in a literal search string can normally be represented as \ or as \\. They are typically equivalent. (There may be unusual cases in Vista where the backslash must always be escaped, but I no longer have a Vista machine to test).

But there are some special cases:

When searching for consecutive backslashes, all but the last must be escaped. The last backslash may optionally be escaped.

  • \\ can be coded as \\\ or \\\\
  • \\\ can be coded as \\\\\ or \\\\\\

Searching for one or more backslashes before a quote is bizarre. Logic would suggest that the quote must be escaped, and each of the leading backslashes would need to be escaped, but this does not work! Instead, each of the leading backslashes must be double escaped, and the quote is escaped normally:

  • \" must be coded as \\\\\"
  • \\" must be coded as \\\\\\\\\"

As previously noted, one or more escaped quotes may also require escaping with ^ for the CMD parser

The info in this section has been confirmed on XP and Windows 7.

Escaping Backslash within command line regex search strings

  • Vista only: Backslash in a regex must be either double escaped like \\\\, or else single escaped within a character class set like [\\]

  • XP and Windows 7: Backslash in a regex can always be represented as [\\]. It can normally be represented as \\. But this never works if the backslash precedes an escaped quote.

    One or more backslashes before an escaped quote must either be double escaped, or else coded as [\\]

    • \" may be coded as \\\\\" or [\\]\"
    • \\" may be coded as \\\\\\\\\" or [\\][\\]\" or \\[\\]\"

Escaping Quote and Backslash within /G:FILE literal search strings
Standalone quotes and backslashes within a literal search string file specified by /G:file need not be escaped, but they can be.

" and \" are equivalent.

\ and \\ are equivalent.

If the intent is to find \\, then at least the leading backslash must be escaped. Both \\\ and \\\\ work.

If the intent is to find ", then at least the leading backslash must be escaped. Both \\" and \\\" work.

Escaping Quote and Backslash within /G:FILE regex search strings
This is the one case where the escape sequences work as expected based on the documentation. Quote is not a regex metacharacter, so it need not be escaped (but can be). Backslash is a regex metacharacter, so it must be escaped.

Character limits for command line parameters - Extended ASCII transformation
The null character (0x00) cannot appear in any string on the command line. Any other single byte character can appear in the string (0x01 - 0xFF). However, FINDSTR converts many extended ASCII characters it finds within command line parameters into other characters. This has a major impact in two ways:

  1. Many extended ASCII characters will not match themselves if used as a search string on the command line. This limitation is the same for literal and regex searches. If a search string must contain extended ASCII, then the /G:FILE option should be used instead.

  2. FINDSTR may fail to find a file if the name contains extended ASCII characters and the file name is specified on the command line. If a file to be searched contains extended ASCII in the name, then the /F:FILE option should be used instead.

Here is a complete list of extended ASCII character transformations that FINDSTR performs on command line strings. Each character is represented as the decimal byte code value. The first code represents the character as supplied on the command line, and the second code represents the character it is transformed into. Note - this list was compiled on a U.S machine. I do not know what impact other languages may have on this list.

158 treated as 080 199 treated as 221 226 treated as 071 169 treated as 170 200 treated as 043 227 treated as 112 176 treated as 221 201 treated as 043 228 treated as 083 177 treated as 221 202 treated as 045 229 treated as 115 178 treated as 221 203 treated as 045 231 treated as 116 179 treated as 221 204 treated as 221 232 treated as 070 180 treated as 221 205 treated as 045 233 treated as 084 181 treated as 221 206 treated as 043 234 treated as 079 182 treated as 221 207 treated as 045 235 treated as 100 183 treated as 043 208 treated as 045 236 treated as 056 184 treated as 043 209 treated as 045 237 treated as 102 185 treated as 221 210 treated as 045 238 treated as 101 186 treated as 221 211 treated as 043 239 treated as 110 187 treated as 043 212 treated as 043 240 treated as 061 188 treated as 043 213 treated as 043 242 treated as 061 189 treated as 043 214 treated as 043 243 treated as 061 190 treated as 043 215 treated as 043 244 treated as 040 191 treated as 043 216 treated as 043 245 treated as 041 192 treated as 043 217 treated as 043 247 treated as 126 193 treated as 045 218 treated as 043 249 treated as 250 194 treated as 045 219 treated as 221 251 treated as 118 195 treated as 043 220 treated as 095 252 treated as 110 196 treated as 045 222 treated as 221 254 treated as 221 197 treated as 043 223 treated as 095 198 treated as 221 224 treated as 097 

Any character >0 not in the list above is treated as itself, including <CR> and <LF>. The easiest way to include odd characters like <CR> and <LF> is to get them into an environment variable and use delayed expansion within the command line argument.

Character limits for strings found in files specified by /G:FILE and /F:FILE options
The nul (0x00) character can appear in the file, but it functions like the C string terminator. Any characters after a nul character are treated as a different string as if they were on another line.

The <CR> and <LF> characters are treated as line terminators that terminate a string, and are not included in the string.

All other single byte characters are included perfectly within a string.

Searching Unicode files
FINDSTR cannot properly search most Unicode (UTF-16, UTF-16LE, UTF-16BE, UTF-32) because it cannot search for nul bytes and Unicode typically contains many nul bytes.

However, the TYPE command converts UTF-16LE with BOM to a single byte character set, so a command like the following will work with UTF-16LE with BOM.

type unicode.txt|findstr "search" 

Note that Unicode code points that are not supported by your active code page will be converted to ? characters.

It is possible to search UTF-8 as long as your search string contains only ASCII. However, the console output of any multi-byte UTF-8 characters will not be correct. But if you redirect the output to a file, then the result will be correctly encoded UTF-8. Note that if the UTF-8 file contains a BOM, then the BOM will be considered as part of the first line, which could throw off a search that matches the beginning of a line.

It is possible to search multi-byte UTF-8 characters if you put your search string in a UTF-8 encoded search file (without BOM), and use the /G option.

End Of Line
FINDSTR breaks lines immediately after every <LF>. The presence or absence of <CR> has no impact on line breaks.

Searching across line breaks
As expected, the . regex metacharacter will not match <CR> or <LF>. But it is possible to search across a line break using a command line search string. Both the <CR> and <LF> characters must be matched explicitly. If a multi-line match is found, only the 1st line of the match is printed. FINDSTR then doubles back to the 2nd line in the source and begins the search all over again - sort of a "look ahead" type feature.

Assume TEXT.TXT has these contents (could be Unix or Windows style)

A A A B A A 

Then this script

@echo off setlocal ::Define LF variable containing a linefeed (0x0A) set LF=^ ::Above 2 blank lines are critical - do not remove ::Define CR variable containing a carriage return (0x0D) for /f %%a in ('copy /Z "%~dpf0" nul') do set "CR=%%a" setlocal enableDelayedExpansion ::regex "!CR!*!LF!" will match both Unix and Windows style End-Of-Line findstr /n /r /c:"A!CR!*!LF!A" TEST.TXT 

gives these results

1:A 2:A 5:A 

Searching across line breaks using the /G:FILE option is imprecise because the only way to match <CR> or <LF> is via a regex character class range expression that sandwiches the EOL characters.

  • [<TAB>-<0x0B>] matches <LF>, but it also matches <TAB> and <0x0B>

  • [<0x0C>-!] matches <CR>, but it also matches <0x0C> and !

Note - the above are symbolic representations of the regex byte stream since I can't graphically represent the characters.

Answer continued in part 2 below...

26
  • 56
    Outstanding completeness. If only all answers on the internet were like this.CommentedApr 9, 2012 at 22:57
  • 2
    EDIT - Described display of control characters as dots on XP. Also documented bugged /S and /D options stemming from short 8.3 file names.
    – dbenham
    CommentedNov 28, 2012 at 3:20
  • 2
    Just for information (I don't know if you already know it, but I see no mention in your answer). The reason of most of the "bizarre" backslash+quotes rules is that findstr is an exe file, and some rules control how backslashes+quotes are handled by the argument tokenizer, but once the arguments are parsed, the findstr code has a string that needs to be compiled into a regular expression instance. So, some backslashes are interpreted twice.
    – MC ND
    CommentedOct 4, 2016 at 6:29
  • 2
    A literal backslash does not need any escape (findstr /l \ *.cmd), but a double quoted literal backslash needs it (findstr /l "\\" *.cmd) to avoid the escaped quote. BUT the findstr string parser will handle a literal backslash followed by a non alphanumeric character ([a-zA-Z0-9]) as a escape character : findstr /l /c:"\ o" *.cmd searchs a space followed by a o character as the backslash escapes the space, but findstr /l /c:"\w" *.cmd searchs a backslash followed by a w character (it is alphanumeric, so it is not escaped)
    – MC ND
    CommentedOct 4, 2016 at 21:54
  • 2
    @dbenham, wondering if there should be a short blurb about the /A: option? The FINDSTR help does not specify that only the file name will be color coded when searching multiple files. One could infer from reading the help for the very first time that it could change the color of the string found in the output. I guess it technically isn't an undocumented feature or limitation but it certainly seems odd that Microsoft doesn't specifically point this out. The documentation on SS64 does.
    – Squashman
    CommentedAug 2, 2018 at 22:00
80

Answer continued from part 1 above - I've run into the 30,000 character answer limit :-(

Limited Regular Expressions (regex) Support
FINDSTR support for regular expressions is extremely limited. If it is not in the HELP documentation, it is not supported.

Beyond that, the regex expressions that are supported are implemented in a completely non-standard manner, such that results can be different then would be expected coming from something like grep or perl.

Regex Line Position anchors ^ and $
^ matches beginning of input stream as well as any position immediately following a <LF>. Since FINDSTR also breaks lines after <LF>, a simple regex of "^" will always match all lines within a file, even a binary file.

$ matches any position immediately preceding a <CR>. This means that a regex search string containing $ will never match any lines within a Unix style text file, nor will it match the last line of a Windows text file if it is missing the EOL marker of <CR><LF>.

Note - As previously discussed, piped and redirected input to FINDSTR may have <CR><LF> appended that is not in the source. Obviously this can impact a regex search that uses $.

Any search string with characters before ^ or after $ will always fail to find a match.

Positional Options /B /E /X
The positional options work the same as ^ and $, except they also work for literal search strings.

/B functions the same as ^ at the start of a regex search string.

/E functions the same as $ at the end of a regex search string.

/X functions the same as having both ^ at the beginning and $ at the end of a regex search string.

Regex word boundary
\< must be the very first term in the regex. The regex will not match anything if any other characters precede it. \< corresponds to either the very beginning of the input, the beginning of a line (the position immediately following a <LF>), or the position immediately following any "non-word" character. The next character need not be a "word" character.

\> must be the very last term in the regex. The regex will not match anything if any other characters follow it. \> corresponds to either the end of input, the position immediately prior to a <CR>, or the position immediately preceding any "non-word" character. The preceding character need not be a "word" character.

Here is a complete list of "non-word" characters, represented as the decimal byte code. Note - this list was compiled on a U.S machine. I do not know what impact other languages may have on this list.

001 028 063 179 204 230 002 029 064 180 205 231 003 030 091 181 206 232 004 031 092 182 207 233 005 032 093 183 208 234 006 033 094 184 209 235 007 034 096 185 210 236 008 035 123 186 211 237 009 036 124 187 212 238 011 037 125 188 213 239 012 038 126 189 214 240 014 039 127 190 215 241 015 040 155 191 216 242 016 041 156 192 217 243 017 042 157 193 218 244 018 043 158 194 219 245 019 044 168 195 220 246 020 045 169 196 221 247 021 046 170 197 222 248 022 047 173 198 223 249 023 058 174 199 224 250 024 059 175 200 226 251 025 060 176 201 227 254 026 061 177 202 228 255 027 062 178 203 229 

Regex character class ranges [x-y]
Character class ranges do not work as expected. See this question: Why does findstr not handle case properly (in some circumstances)?, along with this answer: https://stackoverflow.com/a/8767815/1012053.

The problem is FINDSTR does not collate the characters by their byte code value (commonly thought of as the ASCII code, but ASCII is only defined from 0x00 - 0x7F). Most regex implementations would treat [A-Z] as all upper case English capital letters. But FINDSTR uses a collation sequence that roughly corresponds to how SORT works. So [A-Z] includes the complete English alphabet, both upper and lower case (except for "a"), as well as non-English alpha characters with diacriticals.

Below is a complete list of all characters supported by FINDSTR, sorted in the collation sequence used by FINDSTR to establish regex character class ranges. The characters are represented as their decimal byte code value. I believe the collation sequence makes the most sense if the characters are viewed using code page 437. Note - this list was compiled on a U.S machine. I do not know what impact other languages may have on this list.

001 002 003 004 005 006 007 008 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 127 039 045 032 255 009 010 011 012 013 033 034 035 036 037 038 040 041 042 044 046 047 058 059 063 064 091 092 093 094 095 096 123 124 125 126 173 168 155 156 157 158 043 249 060 061 062 241 174 175 246 251 239 247 240 243 242 169 244 245 254 196 205 179 186 218 213 214 201 191 184 183 187 192 212 211 200 217 190 189 188 195 198 199 204 180 181 182 185 194 209 210 203 193 207 208 202 197 216 215 206 223 220 221 222 219 176 177 178 170 248 230 250 048 172 171 049 050 253 051 052 053 054 055 056 057 236 097 065 166 160 133 131 132 142 134 143 145 146 098 066 099 067 135 128 100 068 101 069 130 144 138 136 137 102 070 159 103 071 104 072 105 073 161 141 140 139 106 074 107 075 108 076 109 077 110 252 078 164 165 111 079 167 162 149 147 148 153 112 080 113 081 114 082 115 083 225 116 084 117 085 163 151 150 129 154 118 086 119 087 120 088 121 089 152 122 090 224 226 235 238 233 227 229 228 231 237 232 234 

Regex character class term limit and BUG
Not only is FINDSTR limited to a maximum of 15 character class terms within a regex, it fails to properly handle an attempt to exceed the limit. Using 16 or more character class terms results in an interactive Windows pop up stating "Find String (QGREP) Utility has encountered a problem and needs to close. We are sorry for the inconvenience." The message text varies slightly depending on the Windows version. Here is one example of a FINDSTR that will fail:

echo 01234567890123456|findstr [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] 

This bug was reported by DosTips user Judago here. It has been confirmed on XP, Vista, and Windows 7.

Regex searches fail (and may hang indefinitely) if they include byte code 0xFF (decimal 255)
Any regex search that includes byte code 0xFF (decimal 255) will fail. It fails if byte code 0xFF is included directly, or if it is implicitly included within a character class range. Remember that FINDSTR character class ranges do not collate characters based on the byte code value. Character <0xFF> appears relatively early in the collation sequence between the <space> and <tab> characters. So any character class range that includes both <space> and <tab> will fail.

The exact behavior changes slightly depending on the Windows version. Windows 7 hangs indefinitely if 0xFF is included. XP doesn't hang, but it always fails to find a match, and occasionally prints the following error message - "The process tried to write to a nonexistent pipe."

I no longer have access to a Vista machine, so I haven't been able to test on Vista.

Regex bug: . and [^anySet] can match End-Of-File
The regex . meta-character should only match any character other than <CR> or <LF>. There is a bug that allows it to match the End-Of-File if the last line in the file is not terminated by <CR> or <LF>. However, the . will not match an empty file.

For example, a file named "test.txt" containing a single line of x, without terminating <CR> or <LF>, will match the following:

findstr /r x......... test.txt 

This bug has been confirmed on XP and Win7.

The same seems to be true for negative character sets. Something like [^abc] will match End-Of-File. Positive character sets like [abc] seem to work fine. I have only tested this on Win7.

9
  • 1
    findstr is also buggy dealing with large files. Files > 2GB can cause findstr to hang. It doens't always happen. In confirming the bug I searched a file that was 2.3GB which didn't hang. It hangs even if searching only a single file. Workaround is to pipe the output of type into findstr.CommentedApr 7, 2014 at 11:15
  • It's probably also worthwhile explcitly mentioning that findstr supports multiple /c: search strings. I know your answers do demonstrate this. But it is something that is not documented; and I was quite surprised to learn of the feature after having used findstr without it for a few years.CommentedApr 7, 2014 at 11:20
  • @CraigYoung - You are right about the search string sources. I edited my answer, thanks.
    – dbenham
    CommentedApr 7, 2014 at 12:34
  • 1
    On further investigation, it looks like a variation on the LF issue you documented. I realised my test file didn't end in LF because I used copy in append mode to create it. I've put a command line session to demonstrate the issue into an answer (stackoverflow.com/a/22943056/224704). Note that input is not redirected, and yet the search hangs. The exact same search command doesn't hang with a smaller files that similarly don't end with LF.CommentedApr 8, 2014 at 16:43
  • 1
    New finding (Win7): findstr /R /C:"^[0-9][0-9]* [0-3][0-9][0-9]-[0-9][0-9]:[0-5][0-9]:[0-5][0-9]\.[0-9][0-9]* [0-9]*\.[0-9]*" (15 character classes) -- ErrorLevel = -1073740791 (0xC0000409), error dialog window: Find String (QGREP) Utility has stopped working; after removing one class or two meta characters (*\.), it works...
    – aschipfl
    CommentedMar 21, 2017 at 19:03
9

findstr sometimes hangs unexpectedly when searching large files.

I haven't confirmed the exact conditions or boundary sizes. I suspect any file larger 2GB may be at risk.

I have had mixed experiences with this, so it is more than just file size. This looks like it may be a variation on FINDSTR hangs on XP and Windows 7 if redirected input does not end with LF, but as demonstrated this particular problem manifests when input is not redirected.

The following command line session (Windows 7) demonstrates how findstr can hang when searching a 3GB file.

C:\Data\Temp\2014-04>echo 1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890> T100B.txt C:\Data\Temp\2014-04>for /L %i in (1,1,10) do @type T100B.txt >> T1KB.txt C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1KB.txt >> T1MB.txt C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1MB.txt >> T1GB.txt C:\Data\Temp\2014-04>echo find this line>> T1GB.txt C:\Data\Temp\2014-04>copy T1GB.txt + T1GB.txt + T1GB.txt T3GB.txt T1GB.txt T1GB.txt T1GB.txt 1 file(s) copied. C:\Data\Temp\2014-04>dir Volume in drive C has no label. Volume Serial Number is D2B2-FFDF Directory of C:\Data\Temp\2014-04 2014/04/08 04:28 PM <DIR> . 2014/04/08 04:28 PM <DIR> .. 2014/04/08 04:22 PM 102 T100B.txt 2014/04/08 04:28 PM 1 020 000 016 T1GB.txt 2014/04/08 04:23 PM 1 020 T1KB.txt 2014/04/08 04:23 PM 1 020 000 T1MB.txt 2014/04/08 04:29 PM 3 060 000 049 T3GB.txt 5 File(s) 4 081 021 187 bytes 2 Dir(s) 51 881 050 112 bytes free C:\Data\Temp\2014-04>rem Findstr on the 1GB file does not hang C:\Data\Temp\2014-04>findstr "this" T1GB.txt find this line C:\Data\Temp\2014-04>rem On the 3GB file, findstr hangs and must be aborted... even though it clearly reaches end of file C:\Data\Temp\2014-04>findstr "this" T3GB.txt find this line find this line find this line ^C C:\Data\Temp\2014-04> 

Note, I've verified in a hex editor that all lines are terminated with CRLF. The only anomaly is that the file is terminated with 0x1A due to the way copy works. Note however, that this anomaly doesn't cause a problem on "small" files.

With additional testing I have confirmed the following:

  • Using copy with the /b option for binary files prevents the addition of the 0x1A character, and findstr doesn't hang on the 3GB file.
  • Terminating the 3GB file with a different character also causes a findstr to hang.
  • The 0x1A character doesn't cause any problems on a "small" file. (Similarly for other terminating characters.)
  • Adding CRLF after 0x1A resolves the problem. (LF by itself would probably suffice.)
  • Using type to pipe the file into findstr works without hanging. (This might be due to a side effect of either type or | that inserts an additional End Of Line.)
  • Use redirected input < also causes findstr to hang. But this is expected; as explained in dbenham's post: "redirected input must end in LF".
1
  • 1
    +1, I am able to confirm the problem on my Win7 machine. A file exactly 2GiB in size hung when the last character was not <LF>. A file two bytes smaller did not hang. Very nasty!
    – dbenham
    CommentedApr 18, 2014 at 11:13
9

When several commands are enclosed in parentheses and there are redirected files to the whole block:

< input.txt ( command1 command2 . . . ) > output.txt 

... then the files remains open as long as the commands in the block be active, so the commands may move the file pointer of the redirected files. Both MORE and FIND commands move the Stdin file pointer to the beginning of the file before process it, so the same file may be processed several times inside the block. For example, this code:

more < input.txt > output.txt more < input.txt >> output.txt 

... produce the same result than this one:

< input.txt ( more more ) > output.txt 

This code:

find "search string" < input.txt > matchedLines.txt find /V "search string" < input.txt > unmatchedLines.txt 

... produce the same result than this one:

< input.txt ( find "search string" > matchedLines.txt find /V "search string" > unmatchedLines.txt ) 

FINDSTR is different; it does not move the Stdin file pointer from its current position. For example, this code insert a new line after a search line:

call :ProcessFile < input.txt goto :EOF :ProcessFile rem Read the next line from Stdin and copy it set /P line= echo %line% rem Test if it is the search line if "%line%" neq "search line" goto ProcessFile rem Insert the new line at this point echo New line rem And copy the rest of lines findstr "^" exit /B 

We may make good use of this feature with the aid of an auxiliary program that allow us to move the file pointer of a redirected file, as shown in this example.

This behavior was first reported by jeb at this post.


EDIT 2018-08-18: New FINDSTR bug reported

The FINDSTR command have a strange bug that happen when this command is used to show characters in color AND the output of such a command is redirected to CON device. For details on how use FINDSTR command to show text in color, see this topic.

When the output of this form of FINDSTR command is redirected to CON, something strange happens after the text is output in the desired color: all the text after it is output as "invisible" characters, although a more precise description is that the text is output as black text over black background. The original text will appear if you use COLOR command to reset the foreground and background colors of the entire screen. However, when the text is "invisible" we could execute a SET /P command, so all characters entered will not appear on the screen. This behavior may be used to enter passwords.

@echo off setlocal set /P "=_" < NUL > "Enter password" findstr /A:1E /V "^$" "Enter password" NUL > CON del "Enter password" set /P "password=" cls color 07 echo The password read is: "%password%" 
    4

    FINDSTR has a color bug that I described and solved at https://superuser.com/questions/1535810/is-there-a-better-way-to-mitigate-this-obscure-color-bug-when-piping-to-findstr/1538802?noredirect=1#comment2339443_1538802

    To summarize that thread, the bug is that if input is piped to FINDSTR within a parenthesized block of code, inline ANSI escape colorcodes stop working in commands executed later. An example of inline colorcodes is: echo %magenta%Alert: Something bad happened%yellow% (where magenta and yellow are vars defined earlier in the .bat file as the corresponding ANSI escape colorcodes).

    My initial solution was to call a do-nothing subroutine after the FINDSTR. Somehow the call or the return "resets" whatever needs to be reset.

    Later I discovered another solution that presumably is more efficient: place the FINDSTR phrase within parentheses, as in the following example: echo success | ( FINDSTR /R success )Placing the FINDSTR phrase within a nested block of code appears to isolate FINDSTR's colorcode bug so it won't affect what's outside the nested block. Perhaps this technique will solve some other undesired FINDSTR side effects too.

    3
    • 1
      Great find. But your rules can be simplified (at least on my enterprise Windows 10 machine). FINDSTR prevents all console escape sequences from working for subsequent commands within the same command block. It doesn't matter if FINDSTR reads a pipe, redirected input, or a file. The escape sequence failure is not restricted to color codes. A command block is any set of commands within parentheses, and/or commands concatenated via &, &&, or ||
      – dbenham
      CommentedApr 9, 2020 at 15:23
    • @dbenham: Nice generalization of the problem. Do you know whether my solution -- nesting the FINDSTR phrase inside parentheses -- works in the general case too? And do you know whether my solution has any undesirable side effects?CommentedApr 9, 2020 at 16:34
    • I did not do any exhaustive testing, but yes, the nested parentheses seems to be a general solution, and I can't think of any possible undesirable side effects.
      – dbenham
      CommentedApr 9, 2020 at 16:45
    2

    I'd like to report a bug regarding the section Source of data to search in the first answer when using en dash (–) or em dash (—) within the filename.

    More specifically, if you are about to use the first option - filenames specified as arguments, the file won't be found. As soon as you use either option 2 - stdin via redirection or 3 - data stream from a pipe, findstr will find the file.

    For example, this simple batch script:

    echo off chcp 1250 > nul set INTEXTFILE1=filename with – dash.txt set INTEXTFILE2=filename with — dash.txt rem 3 way of findstr use with en dashed filename echo. echo Filename with en dash: echo. echo 1. As argument findstr . "%INTEXTFILE1%" echo. echo 2. As stdin via redirection findstr . < "%INTEXTFILE1%" echo. echo 3. As datastream from a pipe type "%INTEXTFILE1%" | findstr . echo. echo. rem The same set of operations with em dashed filename echo Filename with em dash: echo. echo 1. As argument findstr . "%INTEXTFILE2%" echo. echo 2. As stdin via redirection findstr . < "%INTEXTFILE2%" echo. echo 3. As datastream from a pipe type "%INTEXTFILE2%" | findstr . echo. pause 

    will print:

    Filename with en dash:

    1. As argument
      FINDSTR: Cannot open filename with - dash.txt

    2. As stdin via redirection
      I am the file with an en dash.

    3. As datastream from a pipe
      I am the file with an en dash.

    Filename with em dash:

    1. As argument
      FINDSTR: Cannot open filename with - dash.txt

    2. As stdin via redirection
      I am the file with an em dash.

    3. As datastream from a pipe
      I am the file with an em dash.

    Hope it helps.

    M.

    4
    • 1
      Hi matro, while your comments may be correct, I'm not sure they don't address the actual question.CommentedMar 22, 2015 at 11:49
    • I believe this is a unicode issue, which FINDSTR does not support. CMD.EXE redirection can properly open a file name with unicode, as can the TYPE command. But somewhere along the line, FINDSTR converts both the en-dash and em-dash to a normal dash, and of course the OS cannot find that name. If you create another file substituting a dash for the en-dash and/or em-dash, then FINDSTR will search the dash file if provided with a name containing en-dash or em-dash.
      – dbenham
      CommentedMar 22, 2015 at 14:21
    • I would classify this issue as a limitation rather than a bug.
      – dbenham
      CommentedMar 22, 2015 at 14:23
    • Actually, this isn't so much a unicode issue as it is extended ASCII. I already documented this issue in my original answer under the heading Character limits for command line parameters - Extended ASCII transformation. FINDSTR transforms a number of extended ASCII codes into "related" true ASCII, including the en-dash and em-dash.
      – dbenham
      CommentedDec 19, 2016 at 4:51
    2

    The findstr command sets the ErrorLevel (or exit code) to one of the following values, given that there are no invalid or incompatible switches and no search string exceeds the applicable length limit:

    • 0 when at least a single match is encountered in one line throughout all specified files;
    • 1 otherwise;

    A line is considered to contain a match when:

    • no /V option is given and the search expression occurs at least once;
    • the /V option is given and the search expression does not occur;

    This means that the /V option also changes the returned ErrorLevel, but it does not just revert it!

    For example, when you have got a file test.txt with two lines, one of which contains the string text but the other one does not, both findstr "text" "test.txt" and findstr /V "text" "test.txt" return an ErrorLevel of 0.

    Basically you can say: if findstr returns at least a line, ErrorLevel is set to 0, else to 1.

    Note that the /M option does not affect the ErrorLevel value, it just alters the output.

    (Just for the sake of completeness: the find command behaves exactly the same way with respect to the /V option and ErrorLevel; the /C option does not affect ErrorLevel.)

      -1

      /D tip for multiple directories: put your directory list before the search string. These all work:

      findstr /D:dir1;dir2 "searchString" *.* findstr /D:"dir1;dir2" "searchString" *.* findstr /D:"\path\dir1\;\path\dir2\" "searchString" *.* 

      As expected, the path is relative to location if you don't start the directories with \. Surrounding the path with " is optional if there are no spaces in the directory names. The ending \ is optional. The output of location will include whatever path you give it. It will work with or without surrounding the directory list with ".

      5
      • I don't see anything undocumented here. The /D option is described in the built in help. This is not a question for general tips on how to use FINDSTR. It is strictly intended to list undocumented features, limitations, and/or bugs.
        – dbenham
        CommentedJan 22, 2015 at 20:31
      • 1
        @dbenham true it's not truly undocumented, but I found I had to dicker with findstr to get the results I wanted and am sharing what I found DID work so folks wouldn't waste time experimenting with commands that do NOT work. hth (I am sad you dislike my input - it was only intended to be constructive)
        – gordon
        CommentedApr 20, 2015 at 21:29
      • 2
        IMHO the /D switch is clearly described in the built in help: /D:dirlist Search a semicolon-delimited list of directories and it is placed before the search string, so I don't understand what exactly is what "you found" about the /D switch (and what are the "commands that do NOT work")...
        – Aacini
        CommentedOct 8, 2015 at 18:50
      • @Aacini in many languages, the order of attributes does not matter. I understand the documentation for findstr lists /D first. Yes I have no argument with the feature being documented, it just is not documented about the gotcha that the order of attributes matters. I do very little commandline work, so when I was cobbling a command, not aware the order made a difference, I was just adding the attributes as I got to them (and alphabetically, C precedes D). I was getting really frustrated and have shared my "found" experience for anyone else who doesn't work much with commandline.
        – gordon
        CommentedMar 22, 2017 at 21:11
      • 3
        The order of optional attributes usually don't matters. The findstr documentation specify that strings part is NOT optional and that you must place it after the optional attributes and before the optional filename list. If "your found" is that using a command without follow its usage format cause an error, then such a point is well documented. See Command syntax: "Syntax appears in the order in which you must type a command and any parameters that follow it"
        – Aacini
        CommentedMar 22, 2017 at 22:13

      Start asking to get answers

      Find the answer to your question by asking.

      Ask question

      Explore related questions

      See similar questions with these tags.