Documentation

class sortium.sorter.Sorter(file_types_dict: Dict[str, List[str]] = None, file_utils: FileUtils = None)

Bases: object

Organizes files into directories based on various criteria.

Each public method emits a JSON move plan describing every file’s source and destination so you can review, edit, or auto-apply the workflow. The class stays memory-efficient while handling large trees by relying on generators and incremental planning.
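The review-then-apply workflow described above can be sketched as follows. This is a minimal illustration that relies only on the signatures documented on this page (sort_by_type and FileUtils.apply_move_plan); the helper name plan_review_apply is hypothetical, and the sorter and file_utils arguments are assumed to be a Sorter and a FileUtils instance.

```python
from pathlib import Path
from typing import Any, Dict


def plan_review_apply(sorter: Any, file_utils: Any, folder_path: str) -> Dict[str, Any]:
    """Generate a plan, validate it with a dry run, and apply it only if clean.

    Hypothetical helper: `sorter` is assumed to behave like Sorter and
    `file_utils` like FileUtils, as documented on this page.
    """
    # auto_apply=False (the documented default) only writes the JSON plan.
    plan_file: Path = sorter.sort_by_type(folder_path, auto_apply=False)

    # dry_run=True validates the plan without moving any file.
    report = file_utils.apply_move_plan(str(plan_file), dry_run=True)
    if report["errors"]:
        # Leave the files untouched so the plan can be inspected or edited.
        return report

    # The dry run was clean, so execute the same plan for real.
    return file_utils.apply_move_plan(str(plan_file))
```

Keeping the dry run as a separate step means a problem in the plan surfaces before any file has moved.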

file_types_dict

A mapping of file category names to lists of associated file extensions.

Type:

Dict[str, List[str]]

file_utils

An instance of a file utility class.

Type:

FileUtils

sort_by_date(folder_path: str, folder_types: List[str], dest_folder_path: str | None = None, plan_output: str | None = None, auto_apply: bool = False, recursive: bool = False) -> Path

Generates a plan to sort files within categories by modification date.

Files are moved into date-stamped subfolders (e.g., “01-Jan-2023”). Set recursive to True to include files from nested directories within each category.

Parameters:
  • folder_path – Root directory containing the category folders to process.

  • folder_types – List of category folder names (e.g., [‘Images’]).

  • dest_folder_path – Base directory for the sorted folders. Defaults to folder_path when None.

  • plan_output – Optional JSON path override for the emitted plan.

  • auto_apply – If True, immediately executes the generated plan.

  • recursive – When True, scans inside nested directories under each category.

Returns:

Path to the JSON plan file.

Raises:

FileNotFoundError – If folder_path does not exist.
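The date-stamped folder name in the example above corresponds to the strftime pattern "%d-%b-%Y". The sketch below shows one way such a name could be built; the function names and the base/category/date layout are assumptions made for illustration, not sortium's confirmed implementation.

```python
from datetime import datetime
from pathlib import Path


def date_folder_name(modified: datetime) -> str:
    """Format a datetime as a folder name such as '01-Jan-2023'."""
    # %b is the abbreviated month name and follows the active locale.
    return modified.strftime("%d-%b-%Y")


def dated_destination(base_dir: str, category: str, modified: datetime) -> Path:
    """Compose base_dir/category/<date folder> (assumed layout)."""
    return Path(base_dir) / category / date_folder_name(modified)
```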

sort_by_extension(folder_path: str, dest_folder_path: str | None = None, ignore_dir: List[str] | None = None, plan_output: str | None = None, auto_apply: bool = False, recursive: bool = True) -> Path

Generates a plan to sort files by extension into subdirectories.

Parameters:
  • folder_path – Path to the directory containing unsorted files.

  • dest_folder_path – Base directory for the sorted category folders. Falls back to folder_path when None.

  • ignore_dir – Optional directory names to skip when scanning.

  • plan_output – Optional JSON path override for the emitted plan.

  • auto_apply – If True, immediately executes the generated plan.

  • recursive – When True (default), recursively scans the tree.

Returns:

Path to the JSON plan file.

Raises:

FileNotFoundError – If folder_path does not exist.

sort_by_regex(folder_path: str, regex: Dict[str, str], dest_folder_path: str, plan_output: str | None = None, auto_apply: bool = False, recursive: bool = True) -> Path

Generates a plan to sort files based on regex patterns.

Scans folder_path (optionally including subdirectories) for files whose names match the provided regex patterns, then moves them to categorized folders within dest_folder_path.

Parameters:
  • folder_path – Path to the directory to scan.

  • regex – Dictionary mapping category names to regex patterns.

  • dest_folder_path – Base directory where sorted files will be moved.

  • plan_output – Optional JSON path override for the emitted plan.

  • auto_apply – If True, immediately executes the generated plan.

  • recursive – When True (default), recursively scans the folder.

Returns:

Path to the JSON plan file.

Raises:
  • FileNotFoundError – If folder_path does not exist.

  • RuntimeError – If a critical error occurs while preparing the plan.
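To illustrate the shape of the regex argument, the sketch below maps a file name to a category. It assumes that patterns are tried in dictionary order, that the first match wins, and that matching uses re.search semantics; none of this is confirmed by the reference above. The pattern values are made up for the example.

```python
import re
from typing import Dict, Optional


def match_category(file_name: str, patterns: Dict[str, str]) -> Optional[str]:
    """Return the first category whose pattern matches file_name, else None."""
    for category, pattern in patterns.items():
        # re.search matches anywhere in the name; anchor with ^ and $ if needed.
        if re.search(pattern, file_name):
            return category
    return None


# Hypothetical category-to-pattern mapping in the documented Dict[str, str] shape.
example_patterns = {
    "Invoices": r"^invoice_\d+\.pdf$",
    "Screenshots": r"(?i)screenshot",
}
```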

sort_by_type(folder_path: str, dest_folder_path: str | None = None, ignore_dir: List[str] | None = None, plan_output: str | None = None, auto_apply: bool = False, recursive: bool = False) -> Path

Generates a plan to sort files into subdirectories by file type.

Files in the top level of folder_path are mapped into subdirectories (e.g., “Images”, “Documents”) inside dest_folder_path. Enable recursive to scan the entire tree. The plan is written to JSON so it can be inspected or edited before execution.

Note

This method is memory-efficient and suitable for sorting directories with a very large number of files.

Parameters:
  • folder_path – Path to the directory containing unsorted files.

  • dest_folder_path – Base directory for the sorted category folders. Falls back to folder_path when None.

  • ignore_dir – Optional directory names to skip when scanning.

  • plan_output – Optional JSON path override for the emitted plan.

  • auto_apply – If True, immediately executes the generated plan.

  • recursive – When True, recursively scans nested folders.

Returns:

Path to the JSON plan file.

Raises:

FileNotFoundError – If folder_path does not exist.
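Sorting by type comes down to looking up each file's extension in a category mapping. The sketch below shows that lookup with a shortened, partly hypothetical mapping (only the Images list is taken from this page). The case-insensitive comparison and the "Others" fallback are assumptions.

```python
from pathlib import Path
from typing import Dict, List

# Shortened mapping for illustration; the Documents entries are hypothetical.
FILE_TYPES: Dict[str, List[str]] = {
    "Images": [".jpg", ".jpeg", ".png", ".gif"],
    "Documents": [".pdf", ".txt"],
}


def build_extension_index(file_types: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert {category: [extensions]} into {extension: category}."""
    return {
        ext.lower(): category
        for category, extensions in file_types.items()
        for ext in extensions
    }


def category_for(file_name: str, index: Dict[str, str], fallback: str = "Others") -> str:
    """Look up a file's category by its final suffix, ignoring case."""
    return index.get(Path(file_name).suffix.lower(), fallback)
```

Building the inverted index once keeps each per-file lookup constant-time, which matters when a directory holds a very large number of files.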

class sortium.file_utils.FileUtils

Bases: object

Provides memory-efficient utilities for file and directory manipulation.

apply_move_plan(plan_file: str, reverse: bool = False, dry_run: bool = False) -> Dict[str, int | List[str]]

Applies or reverses a JSON move plan produced by Sorter methods.

Parameters:
  • plan_file – Path to the JSON plan file to execute.

  • reverse – If True, moves files back to their source_path.

  • dry_run – If True, validates the plan without moving files.

Returns:

A summary dictionary with the keys entries, moved, and errors.

Raises:

FileNotFoundError – If plan_file does not exist.
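The following sketch shows how applying, reversing, and dry-running a plan might fit together. The plan schema is an assumption: a JSON list of objects with source_path and destination_path keys (only source_path is named on this page). The function apply_plan_sketch is hypothetical and is not sortium's implementation.

```python
import json
import shutil
from pathlib import Path
from typing import Dict, List, Union


def apply_plan_sketch(
    plan_file: str, reverse: bool = False, dry_run: bool = False
) -> Dict[str, Union[int, List[str]]]:
    """Apply (or reverse) a move plan and return a summary dictionary."""
    plan_path = Path(plan_file)
    if not plan_path.exists():
        raise FileNotFoundError(plan_file)

    # Assumed schema: [{"source_path": ..., "destination_path": ...}, ...]
    entries = json.loads(plan_path.read_text(encoding="utf-8"))
    moved = 0
    errors: List[str] = []

    for entry in entries:
        src, dst = Path(entry["source_path"]), Path(entry["destination_path"])
        if reverse:
            # Reversing swaps the roles so files travel back to source_path.
            src, dst = dst, src
        if not src.is_file():
            errors.append(f"missing: {src}")
            continue
        if dry_run:
            # Validation only: nothing is moved.
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        moved += 1

    return {"entries": len(entries), "moved": moved, "errors": errors}
```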

export_directory_structure(folder_path: str, output_file: str, ignore_dir: Sequence[str] | None = None) -> Path

Writes the directory tree rooted at folder_path to a JSON file.

Parameters:
  • folder_path – Directory whose structure should be traced.

  • output_file – Destination JSON file path.

  • ignore_dir – Optional iterable of additional directory or file names to skip alongside DEFAULT_IGNORE_ENTRIES.

Returns:

Path to the generated JSON file.

Raises:
  • FileNotFoundError – If folder_path does not exist.

  • NotADirectoryError – If folder_path is not a directory.
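As a rough picture of what exporting a directory tree to JSON involves, the sketch below builds a nested dictionary and writes it out. The output shape (name, type, children keys) and the function names are assumptions; the actual JSON layout produced by export_directory_structure is not specified on this page.

```python
import json
from pathlib import Path
from typing import Any, Dict, FrozenSet, Iterable


def tree_as_dict(folder: Path, ignored: FrozenSet[str] = frozenset()) -> Dict[str, Any]:
    """Describe a directory as a nested dict (assumed output shape)."""
    node: Dict[str, Any] = {"name": folder.name, "type": "directory", "children": []}
    # Sort by name so the exported file is stable between runs.
    for child in sorted(folder.iterdir(), key=lambda p: p.name):
        if child.name in ignored:
            continue
        if child.is_dir():
            node["children"].append(tree_as_dict(child, ignored))
        else:
            node["children"].append({"name": child.name, "type": "file"})
    return node


def export_tree_sketch(folder_path: str, output_file: str, ignore_dir: Iterable[str] = ()) -> Path:
    """Write the tree rooted at folder_path to output_file as JSON."""
    root = Path(folder_path)
    if not root.exists():
        raise FileNotFoundError(folder_path)
    if not root.is_dir():
        raise NotADirectoryError(folder_path)
    out = Path(output_file)
    out.write_text(json.dumps(tree_as_dict(root, frozenset(ignore_dir)), indent=2), encoding="utf-8")
    return out
```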

find_unique_extensions(source_path: str, ignore_dir: List[str] | None = None) -> Set[str]

Recursively finds all unique file extensions in a directory.

This method is memory-efficient, scanning the directory tree without loading all paths into memory at once.

Parameters:
  • source_path – Path to the root directory to scan.

  • ignore_dir – Additional directory names to ignore alongside the built-in defaults (DEFAULT_IGNORE_ENTRIES).

Returns:

A set of unique file extensions (e.g., {“.txt”, “.jpg”}).

Raises:

FileNotFoundError – If source_path does not exist.
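A minimal sketch of the same idea using only the standard library: os.walk streams one directory at a time, so only the set of extensions is held in memory. The function name, the lowercasing of extensions, and the pruning strategy are assumptions, and DEFAULT_IGNORE_ENTRIES is not modelled here.

```python
import os
from typing import Iterable, Set


def unique_extensions_sketch(source_path: str, ignore_dir: Iterable[str] = ()) -> Set[str]:
    """Collect the distinct file extensions found under source_path."""
    if not os.path.exists(source_path):
        raise FileNotFoundError(source_path)

    ignored = set(ignore_dir)
    found: Set[str] = set()
    for _dirpath, dirnames, filenames in os.walk(source_path):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = [name for name in dirnames if name not in ignored]
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            if extension:  # skip files without an extension
                found.add(extension)
    return found
```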

flatten_dir(folder_path: str, dest_folder_path: str, ignore_dir: Sequence[str] | None = None) -> None

Moves all files from a directory tree into a single destination folder.

This method recursively finds all files in folder_path and moves them to dest_folder_path without preserving the original directory structure.

Note

This operation runs sequentially and does not remove the original (now empty) subdirectories.

Parameters:
  • folder_path – Path to the root folder to flatten.

  • dest_folder_path – Path to the single folder where all files will be moved.

  • ignore_dir – Additional directory names to ignore alongside the built-in defaults (DEFAULT_IGNORE_ENTRIES).

Raises:

FileNotFoundError – If folder_path does not exist.

get_file_modified_date(file_path: str) -> datetime

Returns the last modified datetime of a file.

Parameters:

file_path – Full path to the file.

Returns:

A datetime object for the last modification time.

Raises:

FileNotFoundError – If the file does not exist.
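For reference, a modification time can be read with the standard library as sketched below. The function name is hypothetical, and returning a naive local-time datetime is an assumption about the behaviour described above.

```python
from datetime import datetime
from pathlib import Path


def modified_datetime_sketch(file_path: str) -> datetime:
    """Return the file's last modification time as a naive local datetime."""
    # Path.stat() raises FileNotFoundError when the path does not exist.
    return datetime.fromtimestamp(Path(file_path).stat().st_mtime)
```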

iter_all_files_recursive(folder_path: str, ignore_dir: Sequence[str] | None = None) -> Generator[Path, None, None]

Recursively yields all files in a directory and its subdirectories.

This is a memory-efficient generator that does not load the entire file list into memory.

Parameters:
  • folder_path – Path to the root directory to scan.

  • ignore_dir – Additional directory names to ignore alongside the built-in defaults (DEFAULT_IGNORE_ENTRIES).

Yields:

A Path object for each file found.
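A generator of this kind can be sketched with os.scandir and an explicit stack, so only the directories still to be visited are held in memory. The function name and traversal order are assumptions, and DEFAULT_IGNORE_ENTRIES is not modelled here.

```python
import os
from pathlib import Path
from typing import Generator, Iterable


def iter_files_sketch(folder_path: str, ignore_dir: Iterable[str] = ()) -> Generator[Path, None, None]:
    """Lazily yield every file below folder_path, skipping ignored names."""
    ignored = set(ignore_dir)
    pending = [Path(folder_path)]  # directories still to be scanned
    while pending:
        current = pending.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.name in ignored:
                    continue
                if entry.is_dir(follow_symlinks=False):
                    pending.append(Path(entry.path))
                elif entry.is_file():
                    yield Path(entry.path)
```

Because the function is a generator, a caller can stop early (for example after the first match) without the rest of the tree being scanned.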

iter_shallow_files(folder_path: str, ignore_dir: Sequence[str] | None = None) -> Generator[Path, None, None]

Yields files in the top level of a directory.

This is a non-recursive generator.

Parameters:
  • folder_path – Path to the folder to iterate.

  • ignore_dir – Additional names to ignore alongside the built-in defaults (DEFAULT_IGNORE_ENTRIES).

Yields:

A Path object for each file in the top level of the folder.

plan_destination_path(source_path: str, dest_folder_path: str) -> Path

Predicts the collision-safe destination path for a file move.

Parameters:
  • source_path – Current location of the file.

  • dest_folder_path – Folder where the file is planned to be moved.

Returns:

Path of the file at the destination, including any rename that would be required to avoid collisions.
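One common way to predict a collision-safe destination is to append a counter to the file stem until the name is free. The sketch below shows that approach; the " (1)" suffix format and the function name are assumptions, since the renaming scheme sortium uses is not stated on this page.

```python
from pathlib import Path


def next_free_path(source_path: str, dest_folder_path: str) -> Path:
    """Return a destination path that does not collide with an existing file."""
    source = Path(source_path)
    candidate = Path(dest_folder_path) / source.name
    counter = 1
    while candidate.exists():
        # Assumed naming scheme: "report.pdf" -> "report (1).pdf", "report (2).pdf", ...
        candidate = candidate.with_name(f"{source.stem} ({counter}){source.suffix}")
        counter += 1
    return candidate
```

The function only predicts a path and does not create or move anything, so the result can be recorded in a plan and applied later.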

sortium.config.DEFAULT_FILE_TYPES: Dict[str, List[str]]

Default file type categories and their associated file extensions.

Used to map file extensions to logical categories during file sorting.

Examples

>>> DEFAULT_FILE_TYPES["Images"]
['.jpg', '.jpeg', '.png', '.gif']

Categories include:
  • Images

  • Documents

  • Spreadsheets

  • Presentations

  • Videos

  • Music

  • Archives

  • Code

  • Executables

  • Fonts

  • Design

  • Others