Model Implementation
BaseTranslator
Abstract base class for translator models. It guarantees a consistent interface for all translators, whether they run a local LLM or call an external API.
translate(text)
abstractmethod
Translate a single piece of text from the source language to the target language.
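A minimal sketch of a custom subclass, assuming BaseTranslator and TranslationError are importable from a translators module; the import path and the toy subclass are assumptions, not part of the library:

```python
# Hypothetical subclass used only to illustrate the interface contract.
# The import path below is an assumption; adjust it to your package layout.
from translators import BaseTranslator, TranslationError

class UppercaseTranslator(BaseTranslator):
    """Toy translator that upper-cases its input; useful for wiring tests."""

    def translate(self, text: str) -> str:
        if not text:
            raise TranslationError("Cannot translate empty text")
        return text.upper()

# Assuming the base class needs no constructor arguments:
translator = UppercaseTranslator()
print(translator.translate("hello"))  # -> "HELLO"
```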
MBartTranslator
Translator using Hugging Face’s mBART-50 model for multilingual translation.
__init__(source_lang, target_lang, device='cpu', max_length=512, num_beams=4, tokenizer_kwargs=None, model_kwargs=None)
Initialize the mBART-50 translator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_lang | str | Source language code (e.g. "en_XX"). | required |
| target_lang | str | Target language code (e.g. "de_DE"). | required |
| device | Union[str, device] | "cpu", "cuda", or a torch.device. Defaults to "cpu". | 'cpu' |
| max_length | int | Maximum length of generated sequences. Defaults to 512. | 512 |
| num_beams | int | Number of beams for beam search. Defaults to 4. | 4 |
| tokenizer_kwargs | Optional[Dict[str, Any]] | Extra kwargs for the tokenizer. | None |
| model_kwargs | Optional[Dict[str, Any]] | Extra kwargs for the model. | None |
Raises:
| Type | Description |
|---|---|
| ValueError | If source_lang or target_lang are empty, or if max_length or num_beams are not positive. |
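A minimal construction sketch based on the signature above; the import path is an assumption, and the mBART-50 weights are fetched from Hugging Face on first use:

```python
from translators import MBartTranslator  # import path is an assumption

# English -> German with the documented defaults spelled out.
translator = MBartTranslator(
    source_lang="en_XX",    # mBART-50 language codes
    target_lang="de_DE",
    device="cpu",           # or "cuda" / a torch.device
    max_length=512,
    num_beams=4,
    tokenizer_kwargs=None,  # extra kwargs forwarded to the tokenizer
    model_kwargs=None,      # extra kwargs forwarded to the model
)
```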
translate(text)
Translate a single sentence using the mBART-50 model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The input sentence in the source language. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | Translated sentence. |
Raises:
| Type | Description |
|---|---|
| TranslationError | If the underlying model fails to produce a translation. |
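A usage sketch for translate, handling the documented TranslationError; import paths are assumptions:

```python
from translators import MBartTranslator, TranslationError  # paths are assumptions

translator = MBartTranslator(source_lang="en_XX", target_lang="de_DE")
try:
    print(translator.translate("Machine translation saves time."))
except TranslationError as exc:
    print(f"Translation failed: {exc}")
```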
LLMTranslator
Translator that drives any Ollama-hosted model via the Ollama Python client.
__init__(model_name='llama3.1:8b', num_predict=512, source_lang='English', target_lang='German', stop=None, client=None, prompt_template=None)
Initialize an Ollama-based translator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Ollama model ID (e.g. "llama3.1:8b"). | 'llama3.1:8b' |
| num_predict | int | Maximum number of tokens to predict per call. | 512 |
| source_lang | str | Name of the source language. | 'English' |
| target_lang | str | Name of the target language. | 'German' |
| stop | Optional[List[str]] | Optional list of stop sequences; defaults to ["—"]. | None |
| client | Optional[Client] | Optional pre-configured Ollama Client; if None, constructs a new one. | None |
| prompt_template | Optional[str] | Optional prompt template with placeholders {source_lang}, {target_lang}, {text}. If None, uses DEFAULT_TEMPLATE. | None |
Raises:
| Type | Description |
|---|---|
| ValueError | If model_name, source_lang, or target_lang are empty, or if num_predict is not positive. |
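A construction sketch showing a pre-configured Ollama client and a custom prompt template. The import path of LLMTranslator is an assumption; ollama.Client is the official Python client:

```python
from ollama import Client               # official Ollama Python client
from translators import LLMTranslator   # import path is an assumption

# A custom template; it must keep the documented placeholders.
template = (
    "Translate the following {source_lang} sentence into {target_lang}. "
    "Reply with the translation only.\n\n{text}"
)

translator = LLMTranslator(
    model_name="llama3.1:8b",
    num_predict=256,
    source_lang="English",
    target_lang="German",
    stop=None,                                     # None falls back to the documented default
    client=Client(host="http://localhost:11434"),  # default local Ollama endpoint
    prompt_template=template,
)
```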
translate(text)
Translate the given text via the Ollama LLM.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | A single sentence to translate. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The translated sentence (stripped of surrounding whitespace). |
Raises:
| Type | Description |
|---|---|
| TranslationError | On any failure from the Ollama client. |
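A usage sketch for translate with error handling; the defaults match the signature above, and the import path is an assumption:

```python
from translators import LLMTranslator, TranslationError  # paths are assumptions

translator = LLMTranslator()  # defaults: llama3.1:8b, English -> German
try:
    print(translator.translate("The meeting starts at nine."))
except TranslationError as exc:
    # Raised on any failure from the Ollama client (connection refused, unknown model, ...)
    print(f"Translation failed: {exc}")
```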