Questions tagged [language-model]
Language models are probability distributions over sequences of words or terms and are used extensively in Natural Language Processing (NLP).
154 questions
2 votes
0 answers
15 views
Evaluation of token importance attribution based on human rationales
I am working on evaluating an explainability method for a text classification model that predicts whether a given text sequence contains hate speech or not. The method outputs token-level importance ...
0 votes
0 answers
59 views
How much improvement does OpenAI o1 achieve from the chain of thought?
https://openai.com/index/learning-to-reason-with-llms/ OpenAI o1 also adds more data than the previous version of the LLM.
0 votes
0 answers
64 views
For image+text, how is pre-training of a Multimodal LLM generally done?
For image+text without video, how is pre-training of a Multimodal Large Language Model generally done? Choice-1: Transform the image to text, and then input all the text to the LLM? Choice-2: Transform image to ...
0 votes
0 answers
43 views
Generating transaction data for a dataset to train on
My project is to predict which payment option a customer might use depending on various factors on a checkout screen. For example, here are some of the fields I would have. Variables: User_Location ...
0 votes
0 answers
9 views
What are the key quality metrics for large language model releases?
I am a first-year PhD student working on improving the release practices of machine learning models, especially pre-trained large language models. I want to understand the above concept for a ...
0 votes
0 answers
14 views
What is the query generation re-ranking method?
I am reading up on reranking methodologies that leverage LLMs. Relevant literature. One of the methods suggested is query generation. Or, the same methodology from another source. The task is to rank ...
0 votes
0 answers
28 views
How to find out that a conversation with a chatbot has likely ended
I'm working on a ChatBot with Python and langchain, and I'd like to have a metric that I could use to understand how close we ...
1 vote
1 answer
65 views
Callback handlers in Langchain
This might be an odd question, but why are there two pieces of code for the class BaseCallbackHandler? https://api.python.langchain.com/en/latest/_modules/langchain_core/callbacks/base.html#BaseCallbackHandler ...
0 votes
1 answer
52 views
What languages does llama2 support?
Which languages does llama2 support? I looked at the docs and huggingface but I couldn't find a list. It just says that usage in languages other than English is out of scope.
0 votes
1 answer
50 views
How can I get the list of pretrained large language models?
Is there any place where I can get a list of pre-trained large language models in a neat way? Besides the most common ones like gpt, BARD, llama2, which LLM do you suggest that can be used for RAG and ...
0 votes
1 answer
77 views
How to check the license of an LLM for a specific use?
How can I check whether a large language model has a license that allows fine-tuning the model and then publishing it publicly? How can I be sure that I can use and fine-tune a large language model without ...
0 votes
2 answers
67 views
How to choose the ideal pretrained model for fine-tuning?
I recently started working with LLMs and want to know how people choose pre-trained models for their fine-tuning tasks. What are the criteria for choosing the base model, and which factors affect the choice?
0 votes
1 answer
43 views
Is Machine Reading Comprehension (MRC) outdated?
I recently went through some literature about knowledge-enhanced language models and found connections with the Machine Reading Comprehension (MRC) task. However, I couldn't find papers more recent ...
1 vote
1 answer
613 views
How can I leverage machine learning for log analysis?
I am new to data science and am looking for ways to use data science in my tasks. I have a set of logs which I want to convert to JSON. The logs are more or less of the same format and I can write ...
0 votes
1 answer
186 views
Purely extractive Language Model
Given an email thread, I am trying to extract the body of the most recent email. I used to do that with rules. Now I am testing Large Language Models (LLMs) to see if they provide a less ad hoc ...