Questions tagged [algorithms]
An algorithm is a set of one or more computations that produces a calculated result. All statistical methods are algorithms. Algorithms can be simple, such as calculating a percentage, or very complex, requiring a computer for fast and accurate results.
405 questions
3 votes
1 answer
34 views
Name of algorithm that maps a string column to a float column, based on an aggregation with another float column, similar to TF-IDF
The Question I'm not super familiar with the names of common algorithms in Data Science, and I feel like this would be something that is commonly used, and so should have a name - want to refer to ...
4 votes
2 answers
80 views
Which ML algorithm can find, for a given local authority, the competencies it has (or lacks) that most other authorities have chosen to discard or take?
I would like to know which machine learning algorithm fits my problem. In France, our 1,200 local authorities have competencies: each one depicts a range of services they are expected to provide ...
2 votes
1 answer
75 views
Looking for a better way to calculate positive rate of combinations' sum
Suppose you have a list of 10 floats, and you choose 5 numbers from that list and sum them to form a new number. Generating all possible combinations, you now have a new list ...
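As a point of reference for the question above, here is a minimal brute-force sketch. It assumes "positive rate" means the fraction of 5-element sums that are strictly greater than zero; the excerpt is truncated, so that reading is an assumption, and the function name `positive_rate` is hypothetical. It is only a baseline to compare a faster method against, not the asker's approach.

```python
from itertools import combinations


def positive_rate(values, k=5):
    """Return the fraction of k-element combinations whose sum is > 0.

    Brute force: enumerates every combination, so it is only practical
    for small inputs (10 choose 5 is 252 combinations).
    """
    total = 0
    positive = 0
    for combo in combinations(values, k):
        total += 1
        if sum(combo) > 0:
            positive += 1
    # Guard against k > len(values), where there are no combinations.
    return positive / total if total else 0.0
```

For a list of 10 floats this checks 252 sums, which is cheap; the cost grows combinatorially with the list size, which is presumably why the question asks for a better way.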
4 votes
1 answer
178 views
How to Classify Driving Behaviors (Acceleration, Braking, Turning) Using 2D Coordinates and Velocity?
I'm working on a project to classify driving behaviors based on a vehicle's position and velocity data. For each time step, I have the following information: $x, y$: Position coordinates in a 2D ...
3 votes
1 answer
38 views
The classifications from the model do not align well with business expectations or the "X" metric
Could this discrepancy be caused by the threshold strategy? If so, how should I optimize or adjust these thresholds? How can I better align the model outputs with the business context of "X"?...
0 votes
0 answers
13 views
Data Scientist Trying to Learn Algorithms and Data Structures from the "Introduction to Algorithms" book
I am reading the book "Introduction to Algorithms", 4th edition, by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein to learn relevant knowledge related to ...
1 vote
1 answer
85 views
Leetcode vs Hackerrank in Algorithm Practice
Has anyone used both Leetcode and Hackerrank to practice algorithms? Which one do you think would hit more interview questions?
2 votes
2 answers
169 views
Why does the regression model produced by XGBoost depend on the order of the training data when more than 8194 data points are used?
When I use XGBRegressor to construct a boosted tree model from 8194 or fewer data points (i.e., n_train $\leq$ 8194, where ...
1 vote
0 answers
29 views
Sampling multiple masked tokens through Metropolis–Hastings
I'm trying to replicate the finding of the publication "Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis-Hastings" for obtaining the joint distribution ...
1 vote
1 answer
41 views
Guidelines for image recognition model (inventory purposes)
I have more than 20,000 images of art (paintings, sculptures, jars, etc.) stored in a database. The actual pieces are distributed in multiple warehouses. Ideally, the physical pieces SHOULD have a sticker ...
0 votes
0 answers
39 views
How to build a recommendation system based on user information and without ratings?
I would like to build a recommendation system based only on user information (age, sex, zip code, and some quiz answers). Based on those features I want to recommend assurance products, but I am confused ...
0 votes
1 answer
26 views
Unsupervised detection and counting of a pattern in a time signal without prior knowledge
I'm facing a problem and haven't been able to find a working solution for a few years. TL;DR: Is there a well-known algorithm or NN model that is able to autodetect and count patterns in a time signal ...
4 votes
1 answer
49 views
Algorithm for picking N random uniformly distributed samples in an irregular polygon?
Say I want to pick a fixed number of samples from a large 2D dataset, such that they are relatively evenly distributed over the whole sample area. Imagine places in a country - so the border of the data is ...
1 vote
1 answer
64 views
Find closest color class to an RGB value
I have a module that estimates the color of an object and returns an RGB value in this format: (40, 48, 68) which corresponds to this color: Now I have to classify ...
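One common starting point for the question above is a nearest-neighbor lookup in RGB space. The sketch below is illustrative only: the `COLOR_CLASSES` table and the `closest_color` helper are hypothetical, since the excerpt does not show the asker's actual class list, and plain Euclidean distance in RGB is an assumption rather than the method from any answer.

```python
import math

# Hypothetical color classes; the question's real class list is not shown.
COLOR_CLASSES = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
}


def closest_color(rgb, classes=COLOR_CLASSES):
    """Return the class name whose reference RGB is nearest to `rgb`.

    Uses Euclidean distance over the three channels (math.dist needs
    Python 3.8 or newer).
    """
    return min(classes, key=lambda name: math.dist(rgb, classes[name]))
```

With this table, `closest_color((40, 48, 68))` picks "black" because that reference point is the nearest of the five. RGB distance does not track perceived color difference well, so converting to a perceptual space such as CIELAB before measuring distance may give more intuitive classes.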
1 vote
1 answer
64 views
How can I predict the best treatment to give to a new patient?
As part of a school project, I have to analyze a dataset with patients (with characteristics: sex, age, smoker 0/1, etc.) who received different treatments (one per patient) with a response to this ...