Questions tagged [randomized-algorithms]

0 votes
1 answer
680 views

Is it mandatory to set a random_state when using RandomizedSearchCV?

When I use RandomizedSearchCV and set random_state, I always obtain the same results with the same hyperparameters. So, is it mandatory to set it? Because in my opinion it is better to always ...
Flavio Brienza
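
The effect described in this question can be illustrated without scikit-learn. The sketch below is not RandomizedSearchCV's implementation; `sample_candidates` and the two hyperparameter names are hypothetical. It only shows why a fixed seed makes a random search draw the same candidates on every run, while an unset seed does not guarantee that.

```python
import random
from typing import Optional


def sample_candidates(n_iter: int, seed: Optional[int] = None) -> list:
    """Draw n_iter random hyperparameter settings (hypothetical search space)."""
    rng = random.Random(seed)  # seed=None falls back to OS entropy / current time
    candidates = []
    for _ in range(n_iter):
        candidates.append({
            "max_depth": rng.randint(2, 10),             # integer in [2, 10]
            "learning_rate": 10 ** rng.uniform(-3, 0),   # log-uniform in [0.001, 1]
        })
    return candidates


# Same seed: identical candidates, so identical search results downstream.
run_a = sample_candidates(5, seed=42)
run_b = sample_candidates(5, seed=42)
print(run_a == run_b)
```

Fixing the seed is therefore a reproducibility choice rather than a requirement.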
1 vote
1 answer
169 views

Clustering by using Locality sensitive hashing *after* Random projection

It is well known that Random Projection (RP) is tightly linked to Locality Sensitive Hashing (LSH). My goal is to cluster a large number of points lying in a $d$-dimensional Euclidean space, where $d$ ...
Penelope Benenati
1 vote
1 answer
302 views

Grid Searching seed in randomized machine learning

I was wondering if tuning a seed with cross-validation in order to maximize the performance of an algorithm heavily based on a randomness factor is a good idea or not. I have created an Extra Tree ...
Jonathan
1 vote
0 answers
14 views

Create a random chi-Square independence distribution with a given p-Value

I want to randomly create a table of data that has a predefined p-Value and chi-Value of a chi-square distribution. For example this would have a p-Value of 1 on a chi-square independence test: ...
Cowboy_Patrick
0 votes
1 answer
515 views

What is the objective that is optimized with Random Search?

I have recently learned about Random Search (or sklearn.model_selection.RandomizedSearchCV in Python) and was thinking about the theory behind the optimization process. In particular my question is, ...
RazorLazor
-2 votes
1 answer
883 views

Is shuffling data really necessary for training? [duplicate]

I don't mean if we had a dataset where if sequentially sampled, the labels would be [1111122223333]. In this case, the network learns to predict everything as 1, then 2, and so on and it's impossible ...
user95039
1 vote
1 answer
2k views

How to compute modulo of a hash?

Let's say that I have a set of users in my database, that have GUIDs as their IDs. I use xxhash to generate fixed-length hashes for each value, so that I can then ...
Den
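
One common way to take the modulo of a hash is to interpret the digest bytes as an unsigned integer and reduce that. The sketch below makes two assumptions: `hashlib.blake2b` stands in for xxhash (which is not in the standard library), and `bucket_for` and the sample GUID are hypothetical names used only for illustration.

```python
import hashlib
import uuid


def bucket_for(guid: str, n_buckets: int) -> int:
    """Map a GUID string to a bucket index in [0, n_buckets)."""
    # Assumption: blake2b replaces xxhash; any stable hash works the same way.
    digest = hashlib.blake2b(guid.encode("utf-8"), digest_size=8).digest()
    # Interpret the 8 digest bytes as an unsigned 64-bit integer.
    value = int.from_bytes(digest, byteorder="big")
    return value % n_buckets


user_id = str(uuid.UUID("12345678-1234-5678-1234-567812345678"))
print(bucket_for(user_id, 100))
```

The same GUID always lands in the same bucket, because the hash is deterministic.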
3 votes
3 answers
4k views

Cannot clone object <keras.wrappers.scikit_learn.KerasRegressor object at 0x7fdc9c3ba550>

Trying to tune the hyperparameters of an ANN, but getting an error when calling fit (grid1.fit(X_train, y_train)). Below is the code ...
Ruchika Sancheti
0 votes
1 answer
281 views

RL Sutton book, initial estimate of q*(a) for 10 arm testbed

The Sutton book does not mention what the initial estimate is for q*(a) before the first reward is received. In the code repo that seems to go along with the book (Sutton code repo), they have ...
mLstudent33
0 votes
1 answer
945 views

How to generate 12 independent random weights which all add up to one

I'm using Palisade's @Risk software with a triangular distribution to generate 12 random weights which must add up to one, but I get a lot of negative numbers. Is there a straightforward way to set ...
Angus
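
Weights that must sum to one cannot be fully independent, but non-negative weights can be obtained by normalizing positive draws. The sketch below is plain Python rather than @Risk, and it swaps the triangular distribution for exponential draws: normalizing independent exponential variables gives a flat Dirichlet distribution, which is uniform over all valid weight vectors. `random_weights` is a hypothetical helper name.

```python
import random
from typing import List, Optional


def random_weights(n: int, seed: Optional[int] = None) -> List[float]:
    """Return n non-negative random weights that sum to one."""
    rng = random.Random(seed)
    # Independent exponential draws are never negative.
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    # Dividing by the total forces the weights to sum to one.
    return [d / total for d in draws]


weights = random_weights(12, seed=0)
print(sum(weights))
```

Because every draw is non-negative before normalization, no weight can come out negative.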
4 votes
1 answer
298 views

Why would one crossvalidate the random state number?

Still learning about machine learning, I've stumbled across a kaggle (link), which I cannot understand. Here are lines 72 and 73: ...
Dan Chaltiel
10 votes
3 answers
9k views

Splitting train/test sets by an identifier?

I know sklearn has train_test_split() to split a train and test set. But I read that, even with setting a random seed, if your actual dataset is updated regularly, ...
Greg Rosen
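
One way to keep a split stable when the dataset is updated is to decide membership from a hash of each row's identifier instead of from a shuffled index. The sketch below is a minimal illustration, assuming each row has a stable identifier; `in_test_set` and the integer ids are hypothetical.

```python
import zlib


def in_test_set(identifier, test_ratio: float) -> bool:
    """Assign a row to the test set based only on a hash of its identifier."""
    # CRC32 yields an unsigned 32-bit checksum that is stable across runs.
    checksum = zlib.crc32(str(identifier).encode("utf-8")) & 0xFFFFFFFF
    # Rows whose checksum falls in the lowest test_ratio share go to test.
    return checksum < test_ratio * 2**32


ids = list(range(1000))
test_ids = [i for i in ids if in_test_set(i, 0.2)]
train_ids = [i for i in ids if not in_test_set(i, 0.2)]
print(len(train_ids), len(test_ids))
```

A row never switches sets when new rows are added, since its assignment depends only on its own identifier. The test share is only approximately `test_ratio`.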
11 votes
2 answers
2k views

What is the most efficient method for hyperparameter optimization in scikit-learn?

An overview of the hyperparameter optimization process in scikit-learn is here. Exhaustive grid search will find the best set of hyperparameters among those in the grid. The downside is that exhaustive grid ...
Brian Spiering
1 vote
0 answers
42 views

How to label train_data? [closed]

I have an assignment with four files: 1) train_data.csv: The training file contains two fields (text, id). 2) train_label.csv: The label file contains two fields (id, label). 3) test_data.csv: ...
Mukesh Bhandarkar
5 votes
2 answers
7k views

How to choose the random seed?

I understand this question can be strange, but how do I pick the final random_seed for my classifier? Below is an example code. It uses the ...
Bruno Lubascher