
Questions tagged [string-matching]

1 vote
1 answer
157 views

Data structure for grouping strings in a collection when they share common substrings [closed]

I am looking for a data structure and an algorithm to manage a dynamic collection of strings, grouping together strings that have a substring in common. I will try to describe it through an example. @Christophe:...
differentrain's user avatar
-1 votes
3 answers
351 views

Is there a text distance (or string similarity) algorithm which accounts for the distance between characters?

I'm interested in finding a text distance (or string similarity) algorithm which computes a greater distance (or lower similarity) when characters are further apart. For example, I want the distance ...
Vermillion's user avatar
2 votes
2 answers
911 views

Are there typo-tolerance algorithms (as opposed to string similarity)? [closed]

I want to build a search with basic typo tolerance. There are quite a few string similarity algorithms (and implementations for almost all languages I guess). However, humans tend to make some typos ...
cis's user avatar
  • 255
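
One common starting point for typo tolerance is an edit distance that counts a swap of two adjacent characters as a single edit, since transpositions are a frequent human typo. The following is a minimal sketch of the optimal string alignment variant of Damerau-Levenshtein distance; the function name `osa_distance` is a hypothetical choice for illustration and is not taken from the question.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: insertions, deletions,
    substitutions, and swaps of two adjacent characters each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "ea" typed as "ae".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

With this measure, `osa_distance("search", "saerch")` is 1, whereas plain Levenshtein distance would count the swap as two edits. Whether that matches the typos the asker has in mind is an assumption.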
2 votes
0 answers
237 views

Algorithm to search very long blacklist in another very long data set

I have two data sets. The first data set has approx. 50.000 movie and song titles and the second one has 20.000 blacklist strings. I am looking for the best algorithm to detect movie/song titles which ...
Eray's user avatar
  • 336
0 votes
1 answer
241 views

Data Matching In VBA - Best way to deal with dynamic data and user entry?

Background: I am currently building this project with VBA; keep that in the back of your mind when thinking about my question. Imagine 2 adjacent blocks in Excel. The first block is made up of ...
Ekko's user avatar
1 vote
0 answers
213 views

Name matching in SWIFT messages

Here I am basically looking for a performance improvement. I need to match names in a SWIFT message (let's say MT 103) against sanctions lists (sanctions lists by UN, by OFAC, some custom lists) and ...
bjan's user avatar
  • 229
2 votes
3 answers
2k views

Algorithm for optimizing text compression

I am looking for text compression algorithms (natural language compression, rather than compression of arbitrary binary data). I have seen for example An Efficient Compression Code for Text ...
Lance Pollard's user avatar
1 vote
0 answers
195 views

Phonetic algorithms for words that aren't surnames?

I've been doing a little research into algorithms for matching spelling mistakes in names, from Soundex through to Metaphone and Beider-Morse. All of these algorithms generally focus on last names ...
Jarede's user avatar
1 vote
1 answer
169 views

Find a string in list of strings

Background: I am writing an application for a small embedded device. There is a static list of strings: currently about 500 strings and string length is 12 characters on average. The list might ...
psy's user avatar
  • 137
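
For a static list of a few hundred short strings, one simple option is to sort the list once and binary-search it. The sketch below illustrates the idea in Python; an embedded target would more likely use C, and the helper name `make_lookup` is hypothetical.

```python
from bisect import bisect_left


def make_lookup(strings):
    """Sort the static list once and return a membership test
    that uses binary search (about log2(n) comparisons)."""
    table = sorted(strings)

    def contains(s: str) -> bool:
        i = bisect_left(table, s)
        return i < len(table) and table[i] == s

    return contains


# Hypothetical sample data; the real list is not shown in the question.
contains = make_lookup(["delta", "alpha", "charlie", "bravo"])
```

For roughly 500 entries this needs about 9 string comparisons per lookup and no memory beyond the sorted table itself.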
2 votes
4 answers
3k views

What is the optimal way to perform 5000 unique string replace functions in terms of performance?

I am restructuring some code, and the way I built it up over time has left portions that look something like this: s.replace("ABW"," Aruba "); s.replace("AFG"," Afghanistan "); s.replace("AGO"," Angola "); s....
Anon's user avatar
  • 3,633
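
One way to avoid thousands of separate passes over the string is to put the pairs in a table and do a single regex substitution. This is a minimal sketch, not the asker's code: the three pairs come from the excerpt, while `build_replacer` and `replace_all` are hypothetical names.

```python
import re


def build_replacer(mapping):
    """Compile one alternation of all keys and return a function
    that replaces every key in a single pass over the text."""
    # Longest keys first, so a longer key wins over a key that is its prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))

    def replace_all(text: str) -> str:
        return pattern.sub(lambda m: mapping[m.group(0)], text)

    return replace_all


codes = {"ABW": " Aruba ", "AFG": " Afghanistan ", "AGO": " Angola "}
replace_all = build_replacer(codes)
```

Unlike a chain of `replace` calls, a single pass never rescans text it has already substituted, so the output can differ if a replacement value contains another key.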
2 votes
1 answer
4k views

Efficient multiple substrings search

I have many substrings (2-5 words each) which I would like to search for in some text of about 40-50 words in length. What is the most efficient way to flag matching substrings? Currently I am simply using: ...
skadoosh's user avatar
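
Because the text is short and the phrases are 2-5 words long, one possible approach is to enumerate the word n-grams of the text and look each one up in a set of phrases. The sketch below assumes case-insensitive matching on whitespace-delimited words and does no punctuation handling; `find_phrases` is a hypothetical name.

```python
def find_phrases(text, phrases, min_words=2, max_words=5):
    """Return the phrases that occur in text as whole-word sequences.

    Each candidate n-gram is checked against a set, so the cost depends
    on the text length, not on the number of phrases."""
    words = text.lower().split()
    wanted = {" ".join(p.lower().split()) for p in phrases}
    found = []
    for n in range(min_words, max_words + 1):
        for i in range(len(words) - n + 1):
            candidate = " ".join(words[i:i + n])
            if candidate in wanted:
                found.append(candidate)
    return found
```

For a 50-word text this checks at most a couple of hundred candidates, however many phrases are in the set. If phrases can start or end in the middle of a word, a multi-pattern matcher such as Aho-Corasick is the usual alternative.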
6 votes
2 answers
4k views

Detecting plagiarism – what algorithm?

I'm currently writing a program to read a body of text and compare it to search-engine results (from searching for substrings of the given text), with the goal of detecting plagiarism in, for example, ...
Vivian's user avatar
-6 votes
2 answers
341 views

Which piece of code is more efficient with respect to Time and Memory cost? [closed]

Code 1: private static int myCompare(String a, String b) { /* my version of the compareTo method from the String Java class */ int len1 = a.length(); int len2 = b.length(); if (...
Avid Programmer's user avatar
37 votes
7 answers
51k views

What algorithm would you best use for string similarity?

I am designing a plugin to uniquely identify content on various web pages, based on addresses. So I may have one address which looks like: 1 someawesome street, anytown, F100 211. Later I may find ...
Squiggs.'s user avatar
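
As a rough baseline for comparing two address strings, the Python standard library's `difflib.SequenceMatcher` gives a similarity ratio between 0 and 1. The sketch below normalizes case, punctuation and whitespace first. The normalization rules and the second sample address are assumptions for illustration, and nothing here is the method chosen in the answers.

```python
import re
from difflib import SequenceMatcher


def normalize(address: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", address.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized addresses."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


first = "1 someawesome street, anytown, F100 211"  # from the question
second = "1 someawesome st. anytown F100 211"      # hypothetical variant
score = similarity(first, second)
```

A threshold on `score` would have to be tuned on real data; a plain character-level ratio does not know that "st" and "street" are the same word.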
3 votes
3 answers
139 views

Replace strings based on substring match

I have N strings and M search-replace pairs. Each of the strings contains exactly one of the search terms, and the whole string needs to be replaced by the corresponding replacement. Say you have returns,between,...
chx's user avatar
  • 373
