
Questions tagged [language-model]

Language models are used extensively in Natural Language Processing (NLP) and are probability distributions over a sequence of words or terms.

2 votes
0 answers
15 views

Evaluation of token importance attribution based on human rationales

I am working on evaluating an explainability method for a text classification model that predicts whether a given text sequence contains hate speech or not. The method outputs token-level importance ...
Marc
0 votes
0 answers
59 views

How much improvement does OpenAI o1 achieve from the chain of thought?

https://openai.com/index/learning-to-reason-with-llms/ OpenAI o1 also adds more data than the previous version of the LLM.
CoderOnly
0 votes
0 answers
64 views

For image+text, how is pre-training of a Multimodal LLM generally done?

For image+text without video, how is pre-training of a Multimodal Large Language Model generally done? Choice-1: Transform image to text, and then input all the text to the LLM? Choice-2: Transform image to ...
CoderOnly
0 votes
0 answers
43 views

Generating transaction data for a dataset to train on

My project is to predict which payment option a customer might use, depending on various factors on a checkout screen. For example, here are some of the fields I would have. Variables: User_Location ...
Naeem Mujeeb
0 votes
0 answers
9 views

What are the key quality metrics for large language model releases?

I am a first year PhD student working on improving the release practices of Machine Learning Models, especially pre-trained large language models. I want to understand the above concept for a ...
Eyinlojuoluwa
0 votes
0 answers
14 views

What is the query generation re-ranking method?

I am reading up on reranking methodologies that leverage LLMs. Relevant literature. One of the methods suggested is query generation. Or, the same methodology from another source. The task is to rank ...
figs_and_nuts
0 votes
0 answers
28 views

How to find out that a conversation with a chatbot has likely ended

I'm working on a ChatBot with Python and langchain, and I'd like to have a metric that I could use to understand how close we ...
user163273
1 vote
1 answer
65 views

Callback handlers in Langchain

This might be an odd question, but why are there two code listings for the class BaseCallbackHandler? https://api.python.langchain.com/en/latest/_modules/langchain_core/callbacks/base.html#BaseCallbackHandler ...
Justin Jonany
0 votes
1 answer
52 views

What languages does llama2 support?

Which languages does llama2 support? I looked at the docs and Hugging Face but couldn't find a list. It only says that usage in languages other than English is out of scope.
heyula
0 votes
1 answer
50 views

How can I get the list of pretrained large language models?

Is there any place where I can get a neat list of pre-trained large language models? Apart from the most common ones like GPT, Bard, and llama2, which LLM do you suggest that can be used for RAG and ...
heyula
0 votes
1 answer
77 views

How to check the license of an LLM for a specific use?

How can I check whether a large language model has a license that allows fine-tuning the model and then publishing it publicly? How can I be sure that I can use and fine-tune a large language model without ...
heyula
0 votes
2 answers
67 views

How to choose the ideal pretrained model for fine-tuning?

I recently started working with LLMs and want to know how people choose pre-trained models for their fine-tuning tasks. What are the criteria for choosing the base model, and which factors affect the choice?
heyula
0 votes
1 answer
43 views

Is Machine Reading Comprehension (MRC) outdated?

I recently went through some literature about knowledge-enhanced language models and found connections with the Machine Reading Comprehension (MRC) task. However, I couldn't find papers more recent ...
Barbara Gendron
1 vote
1 answer
613 views

How can I leverage machine learning for log analysis?

I am new to data science and am trying to find ways to use data science in my tasks. I have a set of logs which I want to convert to JSON. The logs are more or less of the same format and I can write ...
SUNITA GUPTA
0 votes
1 answer
186 views

Purely extractive Language Model

Given an email thread, I am trying to extract the body of the most recent email. I used to do that with rules. Now I am testing Large Language Models (LLMs) to see if they provide a less ad hoc ...
mirix
