Questions tagged [dataset]
A dataset is a collection of data, often in tabular or matrix form. This tag is NOT intended for data requests ("where can I find a dataset about ..."); for those, see OpenData.
1,508 questions
0 votes
0 answers
14 views
Neural Networks for Matrix Inversion
For my Bachelor's thesis, I am working on a project named "Neural Networks for Matrix Inversion" where deep learning methods are used to compute the inverse of a matrix in comparison to ...
3 votes
0 answers
18 views
Fetching data for causal AI
How can I get data for causal AI? Are there any resources you can suggest? And do we also need to split the data (train, test) for causal AI?
5 votes
1 answer
36 views
Analyzing if my email notifications increase or decrease total subscriptions
I am hoping to reach someone who knows how to interpret data, if not, someone with better logic than me would still help :) I had around 9000 users paying for monthly subscriptions for a service on ...
2 votes
1 answer
19 views
MTEB/MMTEB: dataset and metric to determine threshold for pair classification task
I'm trying to locally replicate the pair classification task of MMTEB/MTEB. However, I didn't find train/dev sets for all datasets in this task. Table 2 in the original MTEB paper (Muennighoff et al., ...
0 votes
0 answers
59 views
What do the edges represent in the "Gnutella Peer-to-Peer Network, August 8, 2002" dataset?
I was tasked with analysing the "Gnutella peer-to-peer network, August 8 2002" dataset and given only very limited information about it. In particular, I want to figure out what exactly the ...
0 votes
0 answers
14 views
How to optimize similarity calculation performance for a 350k-record dataset?
I'm trying to pre-compute similarity scores for a 350k-record dataset, but the calculations are very slow due to the number of categories, the time similarity calculation, and the string similarity processing. I'...
1 vote
0 answers
22 views
Is one dataset with many images of the same person acceptable?
I am currently using a CNN for face detection. I plan to use open datasets to pre-train one neural network and fine-tune the neural network using images captured by my camera. The open datasets are ...
0 votes
0 answers
16 views
I would like to build an open source Traffic Signs Dataset solely for research purposes
Lately I've been interested in researching different neural networks and how to contribute to autonomous vehicles. I used a couple of images to train a model and the results were different ...
1 vote
0 answers
31 views
Looking for Datasets for Training a 2D Virtual Try-On Model (TryOnDiffusion)
I'm currently working on training a 2D virtual try-on model, specifically something along the lines of TryOnDiffusion, and I'm looking for datasets that can be used for this purpose. Does anyone know ...
1 vote
1 answer
14 views
Combining two YOLO-format computer vision datasets
I am currently trying to combine two datasets into one large dataset to fine-tune a YOLOv8 model. For example, is there any way to combine image datasets in the same YOLO format together/as ...
0 votes
0 answers
6 views
Dataset suggestion
I have started working on a Data Science project, and I am planning on building a conversational agent for my project. For that, I need at least three datasets where I will merge relevant columns with ...
1 vote
0 answers
9 views
Help with BQ table
I am trying to build a partitioned table populated with data from sharded tables. In this partitioned table, I'll unnest some variables so that the data is better organized. My question is: in ...
8 votes
3 answers
90 views
Dataset cleaning and balancing
I have a dataset that relates earthquake magnitude to the dispersion of movement from the epicenter. For example, it maps a Richter-scale magnitude of 1 to 10 to net movement in cm. The nature of the dataset is such ...
1 vote
0 answers
17 views
Cleaning Inconsistently Reported Data Over Time
Let's assume I have a group of friends who, at the start of the new year, have all agreed to text me the total minutes they've exercised that day. I collect that data over the course of the year, and ...
2 votes
0 answers
39 views
Low Dice Score (0.40) in Satellite Image Segmentation Using U-Net (Buildings)
I am working on a building roof segmentation task using satellite images, but I am struggling to improve my Dice score beyond 0.40. I have tried multiple approaches, including: Different U-Net variants (...