Models for Non-LLM translations
MBARTTranslator
MBART model for translation. It supports many-to-many translation across multiple languages and can be run locally.
__init__(target_lang, source_lang=None, device='cuda' if torch.cuda.is_available() else 'cpu', max_length=512, num_beams=4, tokenizer_kwargs=None, model_kwargs=None)
Initializes the MBARTTranslator.
All parameters are used to configure the underlying Hugging Face model and tokenizer, as defined in the HuggingFaceTranslator base class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target_lang | str | The target language code for translation (e.g., 'de_DE' for German, using MBART specific codes if applicable, or generic codes if | required |
source_lang | Optional[str] | The source language code for translation (e.g., 'en_XX' for English). If not provided, the MBART model's default behavior for source language detection applies. | None |
device | Optional[Union[str, device]] | The device (e.g., "cpu", "cuda", "mps") on which the MBART model and tokenizer will be loaded. Defaults to "cuda" if a CUDA-enabled GPU is available, otherwise "cpu". | 'cuda' if is_available() else 'cpu' |
max_length | Optional[int] | The maximum sequence length for generated translations by MBART. Defaults to 512. | 512 |
num_beams | Optional[int] | The number of beams for beam search decoding with MBART. Defaults to 4. | 4 |
tokenizer_kwargs | Optional[Dict[str, Any]] | Additional keyword arguments for the MBART tokenizer. Defaults to None. | None |
model_kwargs | Optional[Dict[str, Any]] | Additional keyword arguments for the MBART model. Defaults to None. | None |
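To illustrate how the documented defaults fit together, here is a minimal sketch that mirrors the `__init__` signature in a plain dataclass. `TranslatorConfig`, `default_device`, and `generation_kwargs` are hypothetical names introduced for this example and are not part of the library; turning a `None` kwargs argument into an empty dict is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


def default_device() -> str:
    """Return "cuda" when PyTorch reports a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # optional in this sketch only
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


@dataclass
class TranslatorConfig:
    """Hypothetical container mirroring the documented __init__ arguments."""

    target_lang: str                      # required, e.g. "de_DE"
    source_lang: Optional[str] = None     # e.g. "en_XX"
    device: str = field(default_factory=default_device)
    max_length: int = 512
    num_beams: int = 4
    # Assumption: an omitted kwargs argument is treated as an empty dict.
    tokenizer_kwargs: Dict[str, Any] = field(default_factory=dict)
    model_kwargs: Dict[str, Any] = field(default_factory=dict)

    def generation_kwargs(self) -> Dict[str, Any]:
        """Collect the settings that would be passed to beam-search decoding."""
        return {"max_length": self.max_length, "num_beams": self.num_beams}


config = TranslatorConfig(target_lang="de_DE", source_lang="en_XX")
print(config.generation_kwargs())  # {'max_length': 512, 'num_beams': 4}
```

Only `target_lang` has no default, which matches the "required" entry in the table above.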
detect_language(text)
Detects the language of the given text using langdetect.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text | str | The text whose language is to be detected. | required |
Returns:
Type | Description |
---|---|
str | The detected language code (e.g., 'en', 'fr'). |
Raises:
Type | Description |
---|---|
ValueError | If the text is empty or invalid for detection. |
LangDetectException | If language detection by the |
ValueError | If the detected language is not in |
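The detection and validation flow described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `detect_supported_language`, `to_mbart_code`, and the `ISO_TO_MBART` table are hypothetical, and the table lists only a small assumed subset of language codes. The detector is injectable so the sketch can be exercised without `langdetect` installed; by default it calls `langdetect.detect`.

```python
from typing import Callable, Dict

# Assumed, illustrative subset of ISO 639-1 -> MBART-style language codes.
ISO_TO_MBART: Dict[str, str] = {
    "en": "en_XX",
    "de": "de_DE",
    "fr": "fr_XX",
    "es": "es_XX",
}


def _langdetect(text: str) -> str:
    # Imported lazily so the module loads even without langdetect installed.
    from langdetect import detect
    return detect(text)


def detect_supported_language(
    text: str, detector: Callable[[str], str] = _langdetect
) -> str:
    """Return the detected language code (e.g. 'en') if it is supported."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("Text must be a non-empty string.")
    code = detector(text)  # langdetect may raise LangDetectException here
    if code not in ISO_TO_MBART:
        raise ValueError(f"Detected language {code!r} is not supported.")
    return code


def to_mbart_code(code: str) -> str:
    """Map a detected code to the MBART-style code used as source_lang."""
    return ISO_TO_MBART[code]
```

With a stub detector, `detect_supported_language("Bonjour le monde", detector=lambda t: "fr")` returns `'fr'`, which `to_mbart_code` maps to `'fr_XX'`.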
translate(text)
Translate the input text.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text | str | The text to translate. | required |
Returns:
Name | Type | Description |
---|---|---|
str | str | The translated text. |
translate_batch(texts)
Translate a batch of texts from the source language to the target language.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
texts | list | A list of texts to be translated. | required |
Returns:
Name | Type | Description |
---|---|---|
list | list | A list of translated texts. |
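For long input lists it can help to feed `translate_batch` in smaller chunks so that each call stays within memory limits. The sketch below shows one way to do this; `translate_in_chunks` and `EchoTranslator` are hypothetical helpers written for this example, and the only thing assumed about the translator is the documented `translate_batch(texts)` method that returns a list.

```python
from typing import List, Protocol, Sequence


class BatchTranslator(Protocol):
    """Anything exposing the documented translate_batch(texts) method."""

    def translate_batch(self, texts: list) -> list: ...


def translate_in_chunks(
    translator: BatchTranslator, texts: Sequence[str], chunk_size: int = 16
) -> List[str]:
    """Translate texts chunk by chunk, preserving the input order."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1.")
    results: List[str] = []
    for start in range(0, len(texts), chunk_size):
        chunk = list(texts[start:start + chunk_size])
        results.extend(translator.translate_batch(chunk))
    return results


class EchoTranslator:
    """Stand-in for demonstration; a real translator would run the model."""

    def __init__(self) -> None:
        self.calls = 0

    def translate_batch(self, texts: list) -> list:
        self.calls += 1
        return [f"[de_DE] {t}" for t in texts]


demo = EchoTranslator()
translated = translate_in_chunks(demo, ["one", "two", "three"], chunk_size=2)
print(translated)  # ['[de_DE] one', '[de_DE] two', '[de_DE] three']
```

The chunk size of 16 is an arbitrary default; a suitable value depends on the device and on `max_length`.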