5
\$\begingroup\$

The problem I want to solve is very simple:

Given a string s and a string t, check if s is subsequence of t. there is only lower case English letters in both s and t. t is potentially a very long (length ~= 500,000) string, and s is a short string (<=100).

Example 1: s = "abc", t = "ahbgdc" Return true.

Example 2: s = "axc", t = "ahbgdc" Return false.

I only learnt C++ for about 30 hours, so I am not sure whether my below solution is correct C++ style or not, especially when to free the pointer allocated space. I assigned two char pointer to each string and I do know I need to free them somewhere else. Since the return value depends on pointer, I don't know where to free the pointer allocated memory, which should be a big deal.

What I thought about this question is that there must be two pointers, each pointing one string. For these two strings, using pointers to compare character one by one.

#include <iostream> #include <string.h> using namespace std; class Solution { public: bool isSubsequence(string s, string t) { char *sPtr = new char[s.size() + 1]; char *tPtr = new char[t.size() + 1]; bool returnVal = false; strcpy(sPtr, s.c_str()); strcpy(tPtr, t.c_str()); while (*sPtr && *tPtr){ if (*sPtr == *tPtr++) { sPtr++; } } return !*sPtr; } }; int main() { string s = "abc"; string t = "ahbgdc"; Solution sol = Solution(); cout << sol.isSubsequence(s, t)<<endl; return 0; } 

After I talked to one of classmates, he suggests that to include and use c++ string class to do this question. Following is my solution followed his suggestion, but for my above solution, he just said "I don't know dude, I just use c++ string class, that's simple. Your solution is just weird to me but I don't know why I feel weird."

class Solution { public: bool isSubsequence(string s, string t) { int sIndex = 0; int tIndex = 0; while (sIndex < s.size() && tIndex < t.size()) { if (s[sIndex] == t[tIndex++]) { sIndex++; } } return sIndex >= s.size(); } }; 
\$\endgroup\$
1
  • \$\begingroup\$Subsequence of contiguous subsequence? Your example does not clarify that. Is "adf" a subsequence of "abcdef"?\$\endgroup\$
    – einpoklum
    CommentedFeb 18, 2020 at 23:05

2 Answers 2

3
\$\begingroup\$

Let's start with your solution, it feels weird, because it is. You mixed C and C++, and on top of that you even refuted back to C string syntax and pointer arithmetic. Especially in plain C, string handling is rather easy to break, even more so if the string didn't come from C to begin with.

In fact, your implementation does not behave correctly in all situations. An std::string can also contain null-bytes '\0' as regular characters. The very same null bytes which double as sentinels when attempting to interpret as a C string.

Well, the assignment did guarantee you that only latin characters would appear in either input, ruling this case out. It's still a possibility you should be aware of though.


char *sPtr = new char[s.size() + 1]; sPtr++; 

This is a huge problem. Even if you wanted to delete that char[] again, you wouldn't be able to. You just threw away your only reference you had.

If you would call delete[] sPtr; now, your application would just crash or worse, because you must call delete on the same pointer as returned by new.

Even if you were to write this in C, you must always keep a pointer to the base address of the dynamically allocated memory region, or you won't be able to free the memory again.

Even though it wouldn't had been necessary to allocate a new char[] anyway. As you don't modify the data, but only the pointer, you could just have said char const *sPtr = s.c_str(); in which case you wouldn't have needed to deal with memory allocation at all.


Your classmates solution is better, but not perfect. While he realized that you can just access an std::string as if it was an array, and avoided the handling of '\0', going via indexes is usually not the preferred solution in C++. It's just far too easy to introduce an off-by-one error when doing that.

The solution to that is simple, rather than going by plain array indexes, stick to iterators when dealing with C++ code:

class Solution { using std::string; public: bool isSubsequence(const string &s, const string &t) { string::iterator s_it = s.begin(); for (string::iterator t_it = t.begin(); s_it != s.end() && t_it != t.end(); ++t_it) { if (*s_it == *t_it) { ++s_it; } } return s_it == s.end(); } }; 

Another benefit of working with iterators rather than array indexes is that this solution also works with other collection types, not just strings or similar array backed structures.

In this case, I e.g. only used the properties of an InputIterator which makes the very same algorithm applicable to a wide range of data structures, including not only std::string, std::vector or std::list, but if required even input streams and alike.

If you like to, you can try to rewrite the solution so that instead of expecting for t to be passed as a string, it just accepts an arbitrary pair of iterators replacing t.begin() and t.end(). These may e.g. be obtained from std::cin or an open file handle. Have a look at the documentation of std::istream_iterator for that.

\$\endgroup\$
0
    9
    \$\begingroup\$

    using namespace std is considered bad practice. Also, in C++, you should #include <string> rather than #include <string.h>.

    The Solution class is an unnecessary complication. You would be better off with just an isSubsequence() function.

    In your first function, returnVal is never used. Compiling with warnings enabled should alert you to the dead code.

    The memory management rule is simple: every new must be paired with exactly one delete, and every new something[] must be paired with exactly one delete[]. Since the first solution calls new[] twice with no delete[], you have a memory leak. Your friend is right: you should take advantage of the string class, because the whole point of the string class is to take care of memory management for you. The second solution is definitely better.

    The function signature is not ideal. Having string parameters will cause each string to be needlessly copied when the function is called. You should write instead:

    bool isSubsequence(const std::string &s, const std::string &t) { … } 

    std::string::size() returns std::string::size_type, so strictly speaking, you should use std::string::size_type instead of int. (Or, if using , just use auto.)

    \$\endgroup\$
    1
    • \$\begingroup\$Probably worth noting that although the solution is weird for C++, it isn't that weird for C (well, mixing strings and a memory leak such and such is kind of weird, but I would expect to see an approach similar to his). Illustrates how the two languages can be very different.\$\endgroup\$
      – Dair
      CommentedOct 23, 2016 at 4:25

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.