:orphan:

..
    _Auto-generated file, do not edit manually ...
    _Toolbox generate command: repo generate_toolbox_rst_documentation
    _ Source component: Fine_Tuning.run_fine_tuning_job


fine_tuning run_fine_tuning_job
===============================

Run a simple fine-tuning Job.


Parameters
----------


``name``  

* The name of the fine-tuning job to create


``namespace``  

* The name of the namespace where the fine-tuning job will be created


``pvc_name``  

* The name of the PVC where the model and dataset are stored


``workload``  

* The name of the workload to run inside the container (fms or ilab)


``model_name``  

* The name of the model to use inside the /model directory of the PVC


``dataset_name``  

* The name of the dataset to use inside the /dataset directory of the PVC


``dataset_replication``  

* Number of replications of the dataset to use, to artificially extend or reduce the fine-tuning effort

* default value: ``1``
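
As a rough illustration of how a replication factor can both extend and reduce a dataset, the sketch below repeats or truncates a list of samples. The ``replicate`` helper and its truncation rule for fractional factors are assumptions made for this example; they are not taken from the toolbox implementation.

```python
def replicate(samples, replication=1):
    """Return the samples repeated (factor > 1) or truncated (factor < 1).

    Hypothetical helper: the rounding behaviour is an assumption.
    """
    if replication <= 0:
        raise ValueError("replication must be positive")
    # Target size, rounded down for fractional factors.
    target = int(len(samples) * replication)
    # Cycle through the original samples until the target size is reached.
    return [samples[i % len(samples)] for i in range(target)]


extended = replicate(["a", "b"], 2)
reduced = replicate(["a", "b", "c", "d"], 0.5)
```

With the default factor of ``1``, the dataset is returned unchanged.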


``dataset_transform``  

* Name of the transformation to apply to the dataset


``dataset_prefer_cache``  

* If True, and the dataset has to be transformed/duplicated, save and/or load it from the PVC

* default value: ``True``


``dataset_prepare_cache_only``  

* If True, only prepare the dataset cache file and do not run the fine-tuning.


``dataset_response_template``  

* The delimiter marking the beginning of the response in the dataset samples


``container_image``  

* The image to use for the fine-tuning container

* default value: ``quay.io/modh/fms-hf-tuning:release-7a8ff0f4114ba43398d34fd976f6b17bb1f665f3``


``gpu``  

* The number of GPUs to request for the fine-tuning job


``memory``  

* The amount of RAM to request for the fine-tuning job (in gigabytes)

* default value: ``10``


``cpu``  

* The number of CPU cores to request for the fine-tuning job (in cores)

* default value: ``1``


``request_equals_limits``  

* If True, sets the 'limits' of the job with the same value as the request.
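
The interaction between ``gpu``, ``memory``, ``cpu`` and ``request_equals_limits`` can be pictured with a minimal sketch that builds a Kubernetes-style ``resources`` mapping. The ``build_resources`` helper, the ``Gi`` memory suffix and the ``nvidia.com/gpu`` resource name are assumptions made for this example, not the toolbox's actual template.

```python
def build_resources(gpu=0, memory=10, cpu=1, request_equals_limits=False):
    """Build a Kubernetes-style ``resources`` mapping (illustrative only)."""
    # The "Gi" suffix is an assumption about how the memory value is rendered.
    requests = {"cpu": str(cpu), "memory": f"{memory}Gi"}
    resources = {"requests": requests}
    if gpu:
        # Assumed GPU resource name. Kubernetes requires extended resources
        # to appear in the limits, so the GPU count is always set there.
        requests["nvidia.com/gpu"] = str(gpu)
        resources["limits"] = {"nvidia.com/gpu": str(gpu)}
    if request_equals_limits:
        # Mirror every request as a limit.
        resources["limits"] = dict(requests)
    return resources


guaranteed = build_resources(gpu=2, memory=10, cpu=1, request_equals_limits=True)
```

When requests and limits match for every resource, Kubernetes can place the Pod in the Guaranteed QoS class.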


``prepare_only``  

* If True, only prepare the environment but do not run the fine-tuning job.


``delete_other``  

* If True, delete the other PyTorchJobs before running


``pod_count``  

* Number of Pods to include in the job

* default value: ``1``


``hyper_parameters``  

* Dictionary of hyper-parameters to pass to sft-trainer


``capture_artifacts``  

* If enabled, captures the artifacts that will help post-mortem analyses

* default value: ``True``


``sleep_forever``  

* If True, sleeps forever instead of running the fine-tuning command.


``ephemeral_output_pvc_size``  

* If a size (with units) is passed, use an ephemeral volume claim for storing the fine-tuning output. Otherwise, use an emptyDir.


``use_roce``  

* If enabled, activates the flags required to use the RoCE fast network
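
To summarize the defaults documented above, the following sketch merges caller-supplied values over them and rejects parameter names that this page does not list. The ``resolve_parameters`` helper is hypothetical and only mirrors the documented names and default values; it is not the toolbox's own argument handling.

```python
# Default values exactly as documented on this page.
DEFAULTS = {
    "dataset_replication": 1,
    "dataset_prefer_cache": True,
    "container_image": "quay.io/modh/fms-hf-tuning:release-7a8ff0f4114ba43398d34fd976f6b17bb1f665f3",
    "memory": 10,
    "cpu": 1,
    "pod_count": 1,
    "capture_artifacts": True,
}

# Documented parameters that have no listed default value.
NO_DEFAULT = {
    "name", "namespace", "pvc_name", "workload", "model_name", "dataset_name",
    "dataset_transform", "dataset_prepare_cache_only",
    "dataset_response_template", "gpu", "request_equals_limits",
    "prepare_only", "delete_other", "hyper_parameters", "sleep_forever",
    "ephemeral_output_pvc_size", "use_roce",
}

KNOWN_PARAMETERS = set(DEFAULTS) | NO_DEFAULT


def resolve_parameters(**overrides):
    """Merge overrides over the documented defaults (hypothetical helper)."""
    unknown = set(overrides) - KNOWN_PARAMETERS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


params = resolve_parameters(name="demo", workload="fms", pod_count=2)
```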