Questions tagged [dataset]

A dataset is a collection of data, often in tabular or matrix form. This tag is NOT intended for data requests ("where can I find a dataset about ..."); for those, see OpenData.

0 votes
0 answers
14 views

Neural Networks for Matrix Inversion

For my Bachelor's thesis, I am working on a project named "Neural Networks for Matrix Inversion" where deep learning methods are used to compute the inverse of a matrix in comparison to ...
Arda Bulbul

3 votes
0 answers
18 views

fetching data for causal AI

How can I get data for causal AI? Are there any resources you can suggest? And do we also need to split the data (train, test) for causal AI?
quanity

5 votes
1 answer
36 views

Analyzing if my email notifications increase or decrease total subscriptions

I am hoping to reach someone who knows how to interpret data; if not, someone with better logic than me would still help :) I had around 9000 users paying for monthly subscriptions for a service on ...
adrianTNT

2 votes
1 answer
19 views

MTEB/MMTEB: dataset and metric to determine threshold for pair classification task

I'm trying to locally replicate the pair classification task of MMTEB/MTEB. However, I didn't find train/dev sets for all datasets in this task. Table 2 in the original MTEB paper (Muennighoff et al., ...
Jonathan

0 votes
0 answers
59 views

What do the edges represent in the "Gnutella Peer-to-Peer Network, August 8, 2002" dataset?

I was tasked with analysing the "Gnutella peer-to-peer network, August 8 2002" dataset and given only very limited information about it. In particular, I want to figure out what exactly the ...
jonupp

0 votes
0 answers
14 views

How to optimize similarity calculation performance for a 350k-record dataset?

I'm trying to pre-compute similarity scores for a 350k-record dataset, but the calculations are very slow due to the number of categories, the time similarity calculation, and the string similarity processing. I'...
DatCra

1 vote
0 answers
22 views

Is one dataset with many images of the same person acceptable?

I am currently using a CNN for face detection. I plan to use open datasets to pre-train one neural network and fine-tune the neural network using images captured by my camera. The open datasets are ...
Jogging Song

0 votes
0 answers
16 views

I would like to build an open source Traffic Signs Dataset solely for research purposes

Lately I've been interested in doing research on different neural networks and how to contribute to autonomous vehicles. I used a couple of images to train a model and the results were different ...
Amy

1 vote
0 answers
31 views

Looking for Datasets for Training a 2D Virtual Try-On Model (TryOnDiffusion)

I'm currently working on training a 2D virtual try-on model, specifically something along the lines of TryOnDiffusion, and I'm looking for datasets that can be used for this purpose. Does anyone know ...
Nico

1 vote
1 answer
14 views

Combining two computer vision datasets in YOLO format

I am currently trying to combine two datasets into one larger dataset to fine-tune a YOLOv8 model. For example, is there any way to combine image datasets that share the same YOLO format together / as ...
Mahinda Rajapaksha

0 votes
0 answers
6 views

Dataset suggestion

I have started working on a Data Science project, and I am planning on building a conversational agent for my project. For that, I need at least three datasets where I will merge relevant columns with ...
Kuppa Sundararajan Ramya Laksh

1 vote
0 answers
9 views

Help with BQ table

I am trying to build a partitioned table populated with data from sharded tables. In this partitioned table, I'll unnest some variables in order to have the data more organized. My question is: in ...
PTaq

8 votes
3 answers
90 views

Dataset cleaning and balancing

I have a dataset that relates earthquake magnitude to the dispersion of movement from the epicenter. For example, it maps magnitudes of 1 to 10 on the Richter scale to net movement in cm. The nature of the dataset is such ...
ChairmanMeow

1 vote
0 answers
17 views

Cleaning Inconsistently Reported Data Over Time

Let's assume I have a group of friends who, at the start of the new year, have all agreed to text me the total minutes they've exercised that day. I collect that data over the course of the year, and ...
GoodAnalysis

2 votes
0 answers
39 views

Low Dice Score (0.40) in Satellite Image Segmentation Using U-Net (Buildings)

I am working on a building roof segmentation task using satellite images, but I am struggling to improve my Dice loss from 0.40. I have tried multiple approaches including: Different U-Net variants (...
Nihar
