Documentation
- class sortium.sorter.Sorter(file_types_dict: Dict[str, List[str]] = None, file_utils: FileUtils = None)
Bases: object
Organizes files into directories based on various criteria.
Each public method emits a JSON move plan describing every file’s source and destination so you can review, edit, or auto-apply the workflow. The class stays memory-efficient while handling large trees by relying on generators and incremental planning.
- file_types_dict
A mapping of file category names to lists of associated file extensions.
- Type:
Dict[str, List[str]]
- sort_by_date(folder_path: str, folder_types: List[str], dest_folder_path: str | None = None, plan_output: str | None = None, auto_apply: bool = False, recursive: bool = False) -> Path
Generates a plan to sort files within categories by modification date.
Files are moved into date-stamped subfolders (e.g., “01-Jan-2023”). Set recursive to pull in files from nested directories within each category.
- Parameters:
folder_path – Root directory containing the category folders to process.
folder_types – List of category folder names (e.g., ['Images']).
dest_folder_path – Base directory for the sorted folders. Defaults to folder_path when None.
plan_output – Optional JSON path override for the emitted plan.
auto_apply – If True, immediately executes the generated plan.
recursive – When True, scans inside nested directories under each category.
- Returns:
Path to the JSON plan file.
- Raises:
FileNotFoundError – If folder_path does not exist.
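Example (illustrative sketch). The date-stamped folder name described above can be produced with strftime. The format string %d-%b-%Y is an assumption inferred from the “01-Jan-2023” example, and date_folder_name is a hypothetical helper, not part of sortium.

```python
from datetime import datetime


def date_folder_name(modified: datetime) -> str:
    """Format a modification time as a folder name such as '01-Jan-2023'.

    Assumption: the format is inferred from the documented example;
    the library may build the name differently.
    """
    # %b is the locale's abbreviated month name ('Jan' in the default C locale).
    return modified.strftime("%d-%b-%Y")


print(date_folder_name(datetime(2023, 1, 1, 9, 30)))
```

Because %b depends on the active locale, a program that calls locale.setlocale may see different month abbreviations.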
- sort_by_extension(folder_path: str, dest_folder_path: str | None = None, ignore_dir: List[str] | None = None, plan_output: str | None = None, auto_apply: bool = False, recursive: bool = True) -> Path
Generates a plan to sort files by extension into subdirectories.
- Parameters:
folder_path – Path to the directory containing unsorted files.
dest_folder_path – Base directory for the sorted category folders. Falls back to folder_path when None.
ignore_dir – Optional directory names to skip when scanning.
plan_output – Optional JSON path override for the emitted plan.
auto_apply – If True, immediately executes the generated plan.
recursive – When True (default), recursively scans the tree.
- Returns:
Path to the JSON plan file.
- Raises:
FileNotFoundError – If folder_path does not exist.
- sort_by_regex(folder_path: str, regex: Dict[str, str], dest_folder_path: str, plan_output: str | None = None, auto_apply: bool = False, recursive: bool = True) -> Path
Generates a plan to sort files based on regex patterns.
Scans folder_path (including subdirectories when recursive is True) for files whose names match the provided regex patterns, then plans their move into categorized folders within dest_folder_path.
- Parameters:
folder_path – Path to the directory to scan.
regex – Dictionary mapping category names to regex patterns.
dest_folder_path – Base directory where sorted files will be moved.
plan_output – Optional JSON path override for the emitted plan.
auto_apply – If True, immediately executes the generated plan.
recursive – When True (default), recursively scans the folder.
- Returns:
Path to the JSON plan file.
- Raises:
FileNotFoundError – If folder_path does not exist.
RuntimeError – If a critical error occurs while preparing the plan.
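Example (illustrative sketch). The regex argument maps category names to patterns. The snippet below shows one plausible way such a mapping could be evaluated against file names. match_category is a hypothetical helper, and both the use of re.search and the first-match-wins order are assumptions; the library may match differently.

```python
import re
from typing import Dict, Optional


def match_category(file_name: str, patterns: Dict[str, str]) -> Optional[str]:
    """Return the first category whose pattern matches file_name, else None."""
    for category, pattern in patterns.items():
        # re.search matches anywhere in the name; anchor the pattern
        # with ^ and $ when the whole name must match.
        if re.search(pattern, file_name):
            return category
    return None


# Hypothetical category names and patterns, for illustration only.
patterns = {
    "Invoices": r"^invoice_\d+\.pdf$",
    "Screenshots": r"(?i)^screenshot",
}

for name in ["invoice_0042.pdf", "Screenshot 2023.png", "notes.txt"]:
    print(name, "->", match_category(name, patterns))
```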
- sort_by_type(folder_path: str, dest_folder_path: str | None = None, ignore_dir: List[str] | None = None, plan_output: str | None = None, auto_apply: bool = False, recursive: bool = False) -> Path
Generates a plan to sort files into subdirectories by file type.
Files in the top level of folder_path are mapped into subdirectories (e.g., “Images”, “Documents”) inside dest_folder_path. Enable recursive to scan the entire tree. The plan is written to JSON so it can be inspected or edited before execution.
Note
This method is memory-efficient and suitable for sorting directories with a very large number of files.
- Parameters:
folder_path – Path to the directory containing unsorted files.
dest_folder_path – Base directory for the sorted category folders. Falls back to folder_path when None.
ignore_dir – Optional directory names to skip when scanning.
plan_output – Optional JSON path override for the emitted plan.
auto_apply – If True, immediately executes the generated plan.
recursive – When True, recursively scans nested folders.
- Returns:
Path to the JSON plan file.
- Raises:
FileNotFoundError – If folder_path does not exist.
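Example (illustrative sketch). Sorting by type relies on a mapping of category names to extension lists, as in file_types_dict. One way to look up a file's category is to invert that mapping once. FILE_TYPES below is an illustrative subset, build_extension_lookup and category_for are hypothetical helpers, and the fallback to “Others” for unknown extensions is an assumption.

```python
from pathlib import Path
from typing import Dict, List

# Illustrative subset only; the real DEFAULT_FILE_TYPES mapping may differ.
FILE_TYPES: Dict[str, List[str]] = {
    "Images": [".jpg", ".jpeg", ".png", ".gif"],
    "Documents": [".pdf", ".txt"],
}


def build_extension_lookup(file_types: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert {category: [extensions]} into {extension: category}."""
    return {
        ext.lower(): category
        for category, extensions in file_types.items()
        for ext in extensions
    }


def category_for(path: Path, lookup: Dict[str, str], fallback: str = "Others") -> str:
    """Return the category for a file, or the fallback for unknown extensions."""
    return lookup.get(path.suffix.lower(), fallback)


lookup = build_extension_lookup(FILE_TYPES)
for name in ["photo.JPG", "report.pdf", "archive.xyz"]:
    print(name, "->", category_for(Path(name), lookup))
```

Inverting the mapping up front keeps each per-file lookup to a single dictionary access, which matters when scanning very large directories.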
- class sortium.file_utils.FileUtils
Bases: object
Provides memory-efficient utilities for file and directory manipulation.
- apply_move_plan(plan_file: str, reverse: bool = False, dry_run: bool = False) -> Dict[str, int | List[str]]
Applies or reverses a JSON move plan produced by Sorter methods.
- Parameters:
plan_file – Path to the JSON plan file to execute.
reverse – If True, moves files back to their source_path.
dry_run – If True, validates the plan without moving files.
- Returns:
A summary dictionary containing entries, moved, and errors keys.
- Raises:
FileNotFoundError – If plan_file does not exist.
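Example (illustrative sketch). The following stand-alone function mimics the apply, reverse, and dry-run behaviour described above using only the standard library. The plan schema is an assumption: a JSON list of objects with source_path and destination_path keys. Only source_path and the summary keys entries, moved, and errors appear in this reference; destination_path and apply_plan_sketch are hypothetical names.

```python
import json
import shutil
import tempfile
from pathlib import Path
from typing import Dict, List, Union


def apply_plan_sketch(
    plan_file: str, reverse: bool = False, dry_run: bool = False
) -> Dict[str, Union[int, List[str]]]:
    """Apply a list of {'source_path', 'destination_path'} entries (assumed schema)."""
    # read_text raises FileNotFoundError when the plan file is missing.
    entries = json.loads(Path(plan_file).read_text(encoding="utf-8"))
    moved = 0
    errors: List[str] = []
    for entry in entries:
        src = Path(entry["source_path"])
        dst = Path(entry["destination_path"])
        if reverse:
            src, dst = dst, src  # undo: move from the destination back to the source
        if not src.is_file():
            errors.append(f"missing: {src}")
            continue
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
            moved += 1
    return {"entries": len(entries), "moved": moved, "errors": errors}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("hello", encoding="utf-8")
    plan = root / "plan.json"
    plan.write_text(
        json.dumps(
            [
                {
                    "source_path": str(root / "a.txt"),
                    "destination_path": str(root / "Documents" / "a.txt"),
                }
            ]
        ),
        encoding="utf-8",
    )
    print(apply_plan_sketch(str(plan), dry_run=True))  # validates only
    print(apply_plan_sketch(str(plan)))                # moves the file
    print(apply_plan_sketch(str(plan), reverse=True))  # moves it back
```

Running the dry run first is a cheap way to surface missing sources before any file is touched.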
- export_directory_structure(folder_path: str, output_file: str, ignore_dir: Sequence[str] | None = None) -> Path
Writes the directory tree rooted at folder_path to a JSON file.
- Parameters:
folder_path – Directory whose structure should be traced.
output_file – Destination JSON file path.
ignore_dir – Optional iterable of additional directory or file names to skip alongside DEFAULT_IGNORE_ENTRIES.
- Returns:
Path to the generated JSON file.
- Raises:
FileNotFoundError – If folder_path does not exist.
NotADirectoryError – If folder_path is not a directory.
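Example (illustrative sketch). The layout of the exported JSON is not specified in this reference. The snippet below shows one possible representation, a nested dictionary in which directories map to dictionaries and files map to null. build_tree and export_tree_sketch are hypothetical helpers, and the output shape is an assumption.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Iterable


def build_tree(folder: Path, ignore: Iterable[str] = ()) -> Dict[str, Any]:
    """Return a nested dict: directories map to dicts, files map to None."""
    skip = set(ignore)
    tree: Dict[str, Any] = {}
    # iterdir raises FileNotFoundError or NotADirectoryError for bad roots.
    for child in sorted(folder.iterdir(), key=lambda p: p.name):
        if child.name in skip:
            continue
        tree[child.name] = build_tree(child, skip) if child.is_dir() else None
    return tree


def export_tree_sketch(folder_path: str, output_file: str, ignore: Iterable[str] = ()) -> Path:
    """Write the nested dict for folder_path to output_file as JSON."""
    tree = build_tree(Path(folder_path), ignore)
    out = Path(output_file)
    out.write_text(json.dumps(tree, indent=2), encoding="utf-8")
    return out


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    (root / "docs" / "a.txt").write_text("a", encoding="utf-8")
    (root / ".git").mkdir()
    out = export_tree_sketch(str(root), str(root / "tree.json"), ignore=[".git"])
    print(out.read_text(encoding="utf-8"))
```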
- find_unique_extensions(source_path: str, ignore_dir: List[str] | None = None) -> Set[str]
Recursively finds all unique file extensions in a directory.
This method is memory-efficient, scanning the directory tree without loading all paths into memory at once.
- Parameters:
source_path – Path to the root directory to scan.
ignore_dir – Additional directory names to ignore alongside the built-in defaults (DEFAULT_IGNORE_ENTRIES).
- Returns:
A set of unique file extensions (e.g., {".txt", ".jpg"}).
- Raises:
FileNotFoundError – If source_path does not exist.
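Example (illustrative sketch). Collecting extensions without holding every path in memory can be done by streaming os.walk and keeping only the set of suffixes. unique_extensions is a hypothetical stand-in, and lowercasing suffixes and skipping files with no suffix are assumptions about the intended behaviour.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable, Set


def unique_extensions(root: str, ignore_dir: Iterable[str] = ()) -> Set[str]:
    """Collect the distinct file suffixes found under root."""
    if not os.path.exists(root):
        raise FileNotFoundError(root)
    skip = set(ignore_dir)
    found: Set[str] = set()
    for _dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into skipped folders.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            suffix = Path(name).suffix.lower()
            if suffix:
                found.add(suffix)
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.TXT").write_text("", encoding="utf-8")
    (root / "sub" / "b.jpg").write_text("", encoding="utf-8")
    print(sorted(unique_extensions(str(root))))
```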
- flatten_dir(folder_path: str, dest_folder_path: str, ignore_dir: Sequence[str] | None = None) -> None
Moves all files from a directory tree into a single destination folder.
This method recursively finds all files in folder_path and moves them to dest_folder_path. It does not preserve the original directory structure.
Note
This operation runs sequentially and does not remove the original (now empty) subdirectories.
- Parameters:
folder_path – Path to the root folder to flatten.
dest_folder_path – Path to the single folder where all files will be moved.
ignore_dir – Additional directory names to ignore alongside the built-in defaults (DEFAULT_IGNORE_ENTRIES).
- Raises:
FileNotFoundError – If folder_path does not exist.
- get_file_modified_date(file_path: str) -> datetime
Returns the last modified datetime of a file.
- Parameters:
file_path – Full path to the file.
- Returns:
A datetime object for the last modification time.
- Raises:
FileNotFoundError – If the file does not exist.
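Example (illustrative sketch). A modification time can be read with the standard library as shown below. modified_datetime is a hypothetical equivalent written for illustration; whether the library returns local time, as this sketch does, is an assumption.

```python
import os
import tempfile
from datetime import datetime
from pathlib import Path


def modified_datetime(file_path: str) -> datetime:
    """Return the file's last modification time as a naive local datetime."""
    # Path.stat() raises FileNotFoundError when the file does not exist.
    return datetime.fromtimestamp(Path(file_path).stat().st_mtime)


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("data", encoding="utf-8")
    # Set a known access and modification time (seconds since the epoch).
    os.utime(sample, (1_700_000_000, 1_700_000_000))
    print(modified_datetime(str(sample)))
```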
- iter_all_files_recursive(folder_path: str, ignore_dir: Sequence[str] | None = None) -> Generator[Path, None, None]
Recursively yields all files in a directory and its subdirectories.
This is a memory-efficient generator that does not load the entire file list into memory.
- Parameters:
folder_path – Path to the root directory to scan.
ignore_dir – Additional directory names to ignore alongside the built-in defaults (DEFAULT_IGNORE_ENTRIES).
- Yields:
Path objects, one for each file found.
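Example (illustrative sketch). A recursive, memory-efficient file iterator can be written as a generator over os.scandir, yielding one Path at a time. iter_files is a hypothetical stand-in, and matching ignored entries by name only is an assumption.

```python
import os
import tempfile
from pathlib import Path
from typing import Generator, Iterable


def iter_files(folder: str, ignore_dir: Iterable[str] = ()) -> Generator[Path, None, None]:
    """Lazily yield every file under folder, skipping ignored names."""
    skip = set(ignore_dir)
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.name in skip:
                continue
            if entry.is_dir(follow_symlinks=False):
                # Delegate to the nested generator; nothing is buffered.
                yield from iter_files(entry.path, skip)
            elif entry.is_file():
                yield Path(entry.path)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub" / "deep").mkdir(parents=True)
    (root / "a.txt").write_text("", encoding="utf-8")
    (root / "sub" / "deep" / "c.txt").write_text("", encoding="utf-8")
    for path in iter_files(str(root)):
        print(path.relative_to(root).as_posix())
```

Because the function is a generator, callers can stop early or process files one by one without building a full list first.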
- iter_shallow_files(folder_path: str, ignore_dir: Sequence[str] | None = None) -> Generator[Path, None, None]
Yields files in the top level of a directory.
This is a non-recursive generator.
- Parameters:
folder_path – Path to the folder to iterate.
ignore_dir – Additional names to ignore alongside the built-in defaults (DEFAULT_IGNORE_ENTRIES).
- Yields:
Path objects, one for each file.
- plan_destination_path(source_path: str, dest_folder_path: str) -> Path
Predicts the collision-safe destination path for a file move.
- Parameters:
source_path – Current location of the file.
dest_folder_path – Folder where the file is planned to be moved.
- Returns:
Path of the file at the destination, including any rename that would be required to avoid collisions.
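Example (illustrative sketch). The renaming scheme used to avoid collisions is not documented here. The snippet below shows one common approach, appending a counter such as “ (1)” before the suffix until an unused name is found. next_free_path and the “name (n).ext” scheme are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def next_free_path(source_path: str, dest_folder_path: str) -> Path:
    """Return dest/name, or dest/'stem (n).ext' when that name is already taken."""
    source = Path(source_path)
    dest = Path(dest_folder_path)
    candidate = dest / source.name
    counter = 1
    while candidate.exists():
        candidate = dest / f"{source.stem} ({counter}){source.suffix}"
        counter += 1
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    dest = Path(tmp)
    (dest / "report.pdf").write_text("", encoding="utf-8")
    # The plain name is taken, so the sketch proposes a numbered variant.
    print(next_free_path("inbox/report.pdf", str(dest)).name)
```

The check only predicts a free name at planning time; another process could still create the same file before the move is executed.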
- sortium.config.DEFAULT_FILE_TYPES: Dict[str, List[str]]
Default file type categories and their associated file extensions.
Used to map file extensions to logical categories during file sorting.
Examples
>>> DEFAULT_FILE_TYPES["Images"]
['.jpg', '.jpeg', '.png', '.gif']
- Categories include:
Images
Documents
Spreadsheets
Presentations
Videos
Music
Archives
Code
Executables
Fonts
Design
Others