dlt.common.libs.git
is_clean_and_synced
def is_clean_and_synced(repo: Repo) -> bool
Checks whether the repo is clean and synced with origin.
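To make "clean and synced" concrete, here is a minimal sketch of one way such a check could be made from the text of `git status --short --branch`. The helper `looks_clean_and_synced` is hypothetical and not part of dlt, and it is an assumption that a check of this kind is adequate; the library's own logic may differ.

```python
def looks_clean_and_synced(status_output: str) -> bool:
    """Interpret the output of `git status --short --branch`.

    Hypothetical helper for illustration only; not part of dlt.
    """
    lines = status_output.strip().splitlines()
    if len(lines) != 1:
        # Either no branch header at all, or extra lines that list
        # modified, staged, or untracked files.
        return False
    header = lines[0]
    # "## main...origin/main" names the local branch and its upstream.
    # A trailing "[ahead 1]" or "[behind 2]" means the two have diverged.
    return (
        header.startswith("##")
        and "..." in header  # assumption: require an upstream to be configured
        and not header.endswith("]")
    )
```

With GitPython, the status text could be obtained with `repo.git.status("--short", "--branch")` and passed to this helper.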
ensure_remote_head
def ensure_remote_head(repo_path: str,
branch: Optional[str] = None,
with_git_command: Optional[str] = None,
path: Optional[str] = None) -> None
Updates repository from origin and ensures it's clean and synced.
Uses sparse checkout when path is specified, fetching only the specified directory instead of the entire repository tree.
Arguments:
repo_path - Local path to the git repository.
branch - Branch to checkout. Defaults to repository's default branch.
with_git_command - Custom GIT_SSH_COMMAND for authentication.
path - Directory path for sparse checkout. When set, only this path is checked out, reducing download size and time.
Raises:
RepositoryDirtyError - If repository has uncommitted changes or is not synced with origin.
force_clone_repo
def force_clone_repo(repo_url: str,
repo_storage: FileStorage,
repo_name: str,
branch: Optional[str] = None,
with_git_command: Optional[str] = None,
path: Optional[str] = None) -> None
Deletes existing repository and performs fresh clone.
Removes repo_storage.root/repo_name if it exists, then clones from repo_url. Uses sparse checkout when path is specified to download only the specified directory, reducing clone time and disk usage.
Arguments:
repo_url - Git repository URL to clone from.
repo_storage - FileStorage instance managing the clone destination.
repo_name - Directory name for the cloned repository.
branch - Branch to checkout after cloning.
with_git_command - Custom GIT_SSH_COMMAND for authentication.
path - Directory path for sparse checkout. When set, only this path is cloned using --filter=blob:none and --depth=1.
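The sparse clone described above can be pictured as two git invocations. The sketch below only builds the command lines; `sparse_clone_commands` is a hypothetical helper, not a dlt function. Combining the two documented flags with `--sparse` and `git sparse-checkout set` is an assumption about how the pieces fit together, not a statement of what the library runs.

```python
from typing import List, Optional


def sparse_clone_commands(
    repo_url: str,
    dest: str,
    path: str,
    branch: Optional[str] = None,
) -> List[List[str]]:
    """Build git command lines for a shallow, blob-less, sparse clone.

    Hypothetical helper for illustration only; not part of dlt.
    """
    # --filter=blob:none : partial clone, file contents are fetched on demand
    # --depth=1          : shallow clone, only the latest commit
    # --sparse           : start with only top-level files checked out
    clone = ["git", "clone", "--filter=blob:none", "--depth=1", "--sparse"]
    if branch:
        clone += ["--branch", branch]
    clone += [repo_url, dest]
    # Restrict the working tree to the requested directory.
    sparse = ["git", "-C", dest, "sparse-checkout", "set", path]
    return [clone, sparse]
```

Each returned list could be run with `subprocess.run(cmd, check=True)`. `git sparse-checkout` requires git 2.25 or newer.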
get_fresh_repo_files
def get_fresh_repo_files(repo_location: str,
working_dir: str = None,
branch: Optional[str] = None,
with_git_command: Optional[str] = None,
path: Optional[str] = None) -> FileStorage
Returns FileStorage with up-to-date repository files.
If repo_location is a local directory, returns storage pointing to it. If it's a git URL, clones or updates the repository in working_dir/repo_name. Supports sparse checkout to fetch only a specific directory path.
Arguments:
repo_location - Local directory path or git repository URL.
working_dir - Directory where repository will be cloned if repo_location is URL.
branch - Branch to checkout.
with_git_command - Custom GIT_SSH_COMMAND for authentication.
path - Directory path for sparse checkout. Downloads only this directory, improving performance for large repositories.
Returns:
FileStorage instance pointing to repository files (or specified path within).
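The local-directory versus URL decision described for get_fresh_repo_files can be sketched with the standard library alone. `resolve_repo_location` is a hypothetical helper, not a dlt function. Deriving repo_name from the last URL segment and raising when working_dir is missing are both assumptions made for this illustration.

```python
import os
from typing import Optional, Tuple


def resolve_repo_location(
    repo_location: str, working_dir: Optional[str] = None
) -> Tuple[str, str]:
    """Return ("local", dir) or ("remote", clone_dir) for a repo location.

    Hypothetical helper for illustration only; not part of dlt.
    """
    if os.path.isdir(repo_location):
        # An existing directory is used in place, nothing is cloned.
        return "local", os.path.abspath(repo_location)
    if working_dir is None:
        # Assumption: a clone destination is required for URLs.
        raise ValueError("working_dir is required when repo_location is a URL")
    # Assumption: the repository name is the last path segment of the URL,
    # with a trailing ".git" removed.
    repo_name = repo_location.rstrip("/").rsplit("/", 1)[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[: -len(".git")]
    # The clone (or update) would then happen in working_dir/repo_name.
    return "remote", os.path.join(working_dir, repo_name)
```

In the "remote" case, a caller would clone into the returned directory on first use and update it on later calls, which matches the clone-or-update behavior described above.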