Python API

This section provides detailed documentation for the Python APIs in Coco-Pack.

Module Overview

cocopack.figure_ops

cocopack.figure_ops.slides_to_images(input_path, output_path, filename_format='figure{:01d}.png', crop_images=True, margin_size='1cm', dpi=300)

Convert presentation slides to image files.

Parameters:
  • input_path (str) – Path to the presentation file (.ppt, .pptx, or .key).

  • output_path (str) – Directory path where the images will be saved.

  • filename_format (str, optional) – Format string for the output filenames. Defaults to ‘figure{:01d}.png’.

  • crop_images (bool, optional) – Whether to crop whitespace around images. Defaults to True.

  • margin_size (str, optional) – Margin size to add around cropped images. Defaults to ‘1cm’.

  • dpi (int, optional) – DPI for the output images. Defaults to 300.
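The filename_format value looks like a standard Python format string that is filled in with a slide number. The snippet below is only an illustration of how such a pattern expands with str.format; whether Coco-Pack numbers slides from 0 or from 1 is an assumption, and padded_format is a hypothetical alternative, not a documented default.

```python
# Illustration only: how a format string like the documented default expands.
# Numbering from 1 is an assumption; the source does not state the start index.
filename_format = "figure{:01d}.png"
padded_format = "figure{:03d}.png"  # hypothetical zero-padded alternative

names = [filename_format.format(i) for i in range(1, 4)]
padded = [padded_format.format(i) for i in range(1, 4)]

print(names)   # ['figure1.png', 'figure2.png', 'figure3.png']
print(padded)  # ['figure001.png', 'figure002.png', 'figure003.png']
```

A zero-padded pattern keeps files in slide order when they are sorted by name, which matters once a deck has ten or more slides.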

cocopack.figure_ops.convert_to_pdf(image_path, output_path=None, dpi=300, **kwargs)

Convert {PNG, JPEG, TIFF} images to high-quality PDF files.

Parameters:
  • image_path (str) – Path to an image file or a directory containing image files.

  • output_path (str, optional) – Path where the PDF files will be saved. If None, uses the same location as the input. Defaults to None.

  • dpi (int, optional) – DPI for the output PDF files. Defaults to 300.

  • **kwargs – Additional keyword arguments. pdf_only (bool): If True, removes the original image files. Defaults to False.

Returns:

None

cocopack.figure_ops.convert_images_to_pdf(input_path, dpi=300, **kwargs)

Convert all {PNG, JPEG, TIFF} images in a directory and its subdirectories to PDF files.

Parameters:
  • input_path (str) – Path to the directory containing {PNG, JPEG, TIFF} images.

  • dpi (int, optional) – DPI for the output PDF files. Defaults to 300.

  • **kwargs – Additional keyword arguments passed to convert_to_pdf. pdf_only (bool): If True, removes the original image files. Defaults to False.

cocopack.figure_ops.mogrify_images_to_pdf(input_path, **kwargs)

Convert {PNG, JPEG, TIFF} images to PDF using ImageMagick’s mogrify command.

Parameters:
  • input_path (str) – Path to the directory containing {PNG, JPEG, TIFF} images.

  • **kwargs – Additional keyword arguments. pdf_only (bool): If True, removes the original image files. Defaults to False.

Note

This function requires ImageMagick to be installed on the system.

cocopack.convert

cocopack.convert.convert_image(source_path, target_format, **kwargs)

Convert an image file to another format.

Parameters:
  • source_path (str) – Path to the source image file.

  • target_format (str) – Target format to convert to (e.g., ‘jpg’, ‘png’, ‘pdf’).

  • **kwargs – Additional keyword arguments. remove_original (bool): Whether to remove the original file. Defaults to True.

Returns:

Path to the converted image file.

Return type:

str

cocopack.notebook

cocopack.notebook.set_autoreload(level='complete')

Configure IPython’s autoreload extension with the specified level.

This function configures IPython’s autoreload extension to automatically reload Python modules before executing code. This is useful during development to see changes to imported modules without restarting the kernel.

Parameters:

level (str, optional) – The autoreload level to set. Options are:
  • ‘off’: Disable autoreload.
  • ‘light’: Reload only modules imported with %aimport.
  • ‘complete’: Automatically reload all modules (default).

Return type:

None

Examples

>>> from cocopack import notebook
>>> notebook.set_autoreload('complete') # Enable full autoreload
>>> notebook.set_autoreload('off')  # Disable autoreload

Notes

This function must be run in an IPython environment (e.g., Jupyter notebook). It will raise an EnvironmentError if run in a standard Python interpreter.
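The three levels line up with the modes of IPython's %autoreload magic (0 disables reloading, 1 reloads only modules marked with %aimport, 2 reloads all modules). The sketch below shows one plausible mapping; AUTORELOAD_MODES and autoreload_magics are hypothetical names used for illustration and are not taken from Coco-Pack's source.

```python
# Hypothetical mapping from the documented level names to IPython's
# %autoreload modes. This mirrors the documented behavior; it is not
# Coco-Pack's actual implementation.
AUTORELOAD_MODES = {"off": "0", "light": "1", "complete": "2"}


def autoreload_magics(level="complete"):
    """Return the IPython magic lines that would configure autoreload."""
    if level not in AUTORELOAD_MODES:
        options = ", ".join(sorted(AUTORELOAD_MODES))
        raise ValueError(f"Unknown level {level!r}; expected one of: {options}")
    return ["%load_ext autoreload", f"%autoreload {AUTORELOAD_MODES[level]}"]


print(autoreload_magics("light"))  # ['%load_ext autoreload', '%autoreload 1']
```

Returning the magic lines instead of executing them keeps the sketch runnable outside IPython, where the magics themselves are unavailable.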

cocopack.overleaf

cocopack.overleaf.set_overleaf_root(overleaf_root=None)

Set the root directory for Overleaf projects.

Parameters:

overleaf_root (str, optional) – Path to the Overleaf root directory. If None, prompts the user to enter the directory. Defaults to None.

cocopack.overleaf.get_overleaf_root(overleaf_root=None)

Get the root directory for Overleaf projects.

Parameters:

overleaf_root (str, optional) – Path to the Overleaf root directory. If provided, returns this value. Defaults to None.

Returns:

Path to the Overleaf root directory. If not provided, tries to get it from globals or environment variables, and otherwise prompts the user.

Return type:

str

cocopack.overleaf.get_overleaf_path(project_name, overleaf_root=None)

Get the full path to an Overleaf project.

Parameters:
  • project_name (str) – Name of the Overleaf project.

  • overleaf_root (str, optional) – Path to the Overleaf root directory. If None, gets it from get_overleaf_root(). Defaults to None.

Returns:

Full path to the Overleaf project.

Return type:

str

cocopack.overleaf.list_overleaf_projects(overleaf_root=None, exclusions=[], sort_by_date=True, **kwargs)

List all Overleaf projects in the root directory.

Parameters:
  • overleaf_root (str, optional) – Path to the Overleaf root directory. If None, gets it from get_overleaf_root(). Defaults to None.

  • exclusions (list, optional) – Substrings to filter on; projects whose names contain any of them are omitted. Defaults to an empty list.

  • sort_by_date (bool, optional) – Whether to sort projects by modification date. Defaults to True.

  • **kwargs – Additional keyword arguments. verbose (bool): If True, prints projects with their last modified dates. Defaults to False.

Returns:

List of Overleaf project names.

Return type:

list
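The filtering and sorting described above can be sketched with the standard library alone. list_projects is a hypothetical stand-in, not the library function, and the newest-first ordering is an assumption, since the source does not say which direction sort_by_date uses.

```python
import os
import tempfile


def list_projects(root, exclusions=(), sort_by_date=True):
    """Return sub-directory names of root, filtered and optionally sorted."""
    names = [
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
        and not any(token in name for token in exclusions)
    ]
    if sort_by_date:
        # Newest first is an assumption; the documented order is unspecified.
        names.sort(key=lambda n: os.path.getmtime(os.path.join(root, n)), reverse=True)
    else:
        names.sort()
    return names


with tempfile.TemporaryDirectory() as root:
    for name, mtime in [("paper-a", 1000), ("paper-b", 2000), ("archive-old", 3000)]:
        path = os.path.join(root, name)
        os.mkdir(path)
        os.utime(path, (mtime, mtime))  # pin the modification time for the demo
    projects = list_projects(root, exclusions=["archive"])

print(projects)  # ['paper-b', 'paper-a']
```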

cocopack.overleaf.gather_submission(project_path, main_file, support_files, output_dir, **kwargs)

Gather LaTeX project files for submission, stitching files together and organizing references.

Parameters:
  • project_path (str) – Path to the project root directory.

  • main_file (str) – Name of the main LaTeX file.

  • support_files (list) – List of supporting files to include (images, bibtex, etc.).

  • output_dir (str) – Directory where gathered submission will be saved.

  • **kwargs – Additional keyword arguments:
      • prepend_project (bool): If True, prepend project_path to output_dir. Defaults to False.
      • fresh_start (bool): If True, clear the output directory if it exists. Defaults to True.
      • main_name (str): Name for the output main file. Defaults to ‘manuscript.tex’.
      • new_names (dict): Map of original filenames to new filenames. Defaults to {}.
      • image_format (str): Convert images to this format if specified. Defaults to None.
      • verbose (bool): If True, print detailed information. Defaults to False.
      • stitch_bibtex (bool): If True, stitch bibtex files together. Defaults to True.
      • exclude_comments (bool): If True, exclude commented lines when updating references. Defaults to True.

cocopack.overleaf.find_tex_inputs(project_dir, main_file='main.tex', depth=0, **kwargs)

Recursively find all LaTeX \input{} commands in a main file and its included files.

Parameters:
  • project_dir (str) – Path to the project directory.

  • main_file (str, optional) – Name of the main LaTeX file. Defaults to ‘main.tex’.

  • depth (int, optional) – Current recursion depth. Defaults to 0.

  • **kwargs – Additional keyword arguments:
      • max_depth (int): Maximum recursion depth. Defaults to 5.
      • prepend_path (bool): If True, prepend the directory path to input files. Defaults to False.

Returns:

Nested dictionary representing the structure of the LaTeX files and their inputs.

Return type:

dict
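A minimal sketch of the recursive search described above, assuming inputs are written as \input{name} and that a missing .tex extension should be appended. find_inputs and INPUT_PATTERN are hypothetical names; the sketch ignores commented-out lines and is not the library's implementation.

```python
import os
import re
import tempfile

# Matches \input{name}; comment handling is deliberately left out of this sketch.
INPUT_PATTERN = re.compile(r"\\input\{([^}]+)\}")


def find_inputs(project_dir, main_file="main.tex", depth=0, max_depth=5):
    """Return a nested dict {input_file: {its own inputs...}} for main_file."""
    tree = {}
    path = os.path.join(project_dir, main_file)
    if depth >= max_depth or not os.path.isfile(path):
        return tree
    with open(path, encoding="utf-8") as handle:
        content = handle.read()
    for name in INPUT_PATTERN.findall(content):
        # LaTeX accepts \input{intro} as shorthand for intro.tex
        if not name.endswith(".tex"):
            name += ".tex"
        tree[name] = find_inputs(project_dir, name, depth + 1, max_depth)
    return tree


with tempfile.TemporaryDirectory() as tmp:
    files = {
        "main.tex": "\\input{intro}\n\\input{methods.tex}\n",
        "intro.tex": "\\input{background}\n",
        "methods.tex": "No inputs here.\n",
        "background.tex": "Leaf file.\n",
    }
    for name, text in files.items():
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as handle:
            handle.write(text)
    demo_tree = find_inputs(tmp)

print(demo_tree)  # {'intro.tex': {'background.tex': {}}, 'methods.tex': {}}
```

The depth counter guards against circular inputs, which is presumably why the documented function exposes depth and max_depth.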

cocopack.overleaf.find_all_inputs(project_path, main_file, stitch_first=False, **kwargs)

Find all files referenced in a LaTeX document through various commands.

This function scans a LaTeX document for references to other files made through commands such as \input, \includegraphics, and \bibliography.

Parameters:
  • project_path (str) – Path to the project directory.

  • main_file (str) – Name of the main LaTeX file.

  • stitch_first (bool, optional) – If True, stitch all input files before searching. Defaults to False.

  • **kwargs – Additional keyword arguments:
      • exclusions (list): Substrings to filter on; files whose paths contain any of them are excluded.
      • files_only (bool): If True, return only file paths without match context. Defaults to False.

Returns:

Either a dictionary mapping file paths to their match context, or a list of file paths if files_only=True.

Return type:

Union[dict, list]

cocopack.overleaf.stitch_tex_files(project_dir, main_file='main.tex', output_file=None, **kwargs)

Stitch together a LaTeX document by resolving all \input commands.

Parameters:
  • project_dir (str) – Path to the project directory.

  • main_file (str, optional) – Name of the main LaTeX file. Defaults to ‘main.tex’.

  • output_file (str, optional) – Path where the stitched file will be saved. If None, the function will only return the content. Defaults to None.

  • **kwargs – Additional keyword arguments:
      • exclude_with_comment (list): List of patterns to comment out instead of including.
      • exclude (list): List of patterns to exclude from stitching.
      • verbose (bool): If True, print detailed information. Defaults to False.
      • content_only (bool): If True, only return the content without writing to a file. Defaults to True.

Returns:

The stitched LaTeX content.

Return type:

str
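Stitching amounts to replacing each \input command with the content of the file it names, recursively. The sketch below shows the idea with a regular-expression substitution; stitch is a hypothetical helper that omits the exclusion options, has no recursion limit, and raises FileNotFoundError for a missing file, so it should not be read as the library's behavior.

```python
import os
import re
import tempfile

INPUT_PATTERN = re.compile(r"\\input\{([^}]+)\}")


def stitch(project_dir, main_file="main.tex"):
    """Return the content of main_file with each input command expanded inline."""
    with open(os.path.join(project_dir, main_file), encoding="utf-8") as handle:
        content = handle.read()

    def expand(match):
        name = match.group(1)
        if not name.endswith(".tex"):
            name += ".tex"
        # Recurse so that inputs nested inside the included file are expanded too
        return stitch(project_dir, name)

    return INPUT_PATTERN.sub(expand, content)


with tempfile.TemporaryDirectory() as tmp:
    sources = {"main.tex": "A\n\\input{part}\nC\n", "part.tex": "B"}
    for name, text in sources.items():
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as handle:
            handle.write(text)
    stitched = stitch(tmp)

print(repr(stitched))  # 'A\nB\nC\n'
```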

cocopack.overleaf.get_bibtex_dir(project_name, bibtex_dir='citation', **kwargs)

Get the path to the directory containing BibTeX files for a project.

Parameters:
  • project_name (str) – Name of the Overleaf project.

  • bibtex_dir (str, optional) – Name of the directory containing BibTeX files. Defaults to ‘citation’.

  • **kwargs – Additional keyword arguments. overleaf_root (str): Path to the Overleaf root directory.

Returns:

Path to the BibTeX directory.

Return type:

str

cocopack.overleaf.get_bibtex_files(project_path, bibtex_dir, other_dirs=[])

Get a list of BibTeX files in the specified directories.

Parameters:
  • project_path (str) – Path to the project root directory.

  • bibtex_dir (str) – Name of the primary directory containing BibTeX files.

  • other_dirs (list, optional) – List of additional directories to search for BibTeX files. Defaults to an empty list.

Returns:

List of relative paths to BibTeX files.

Return type:

list

cocopack.overleaf.clean_bibtex_file(input_file_path, output_file_path=None)

Remove commented lines from a BibTeX file.

Parameters:
  • input_file_path (str) – Path to the input BibTeX file.

  • output_file_path (str, optional) – Path where the cleaned file will be saved. If None, returns the cleaned content as a StringIO object. Defaults to None.

Returns:

StringIO object containing the cleaned content if output_file_path is None, otherwise None.

Return type:

io.StringIO
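As a rough illustration of the in-memory case, the sketch below drops every line whose first non-blank character is %, and returns the rest in a StringIO. That reading of "commented lines" is an assumption, and strip_bibtex_comments is a hypothetical name that works on a string, not on a file path.

```python
import io


def strip_bibtex_comments(text):
    """Return a StringIO holding text without lines that start with '%'."""
    kept = [
        line for line in text.splitlines(keepends=True)
        if not line.lstrip().startswith("%")
    ]
    return io.StringIO("".join(kept))


sample = (
    "% old entry\n"
    "@article{a2020,\n"
    "  title={A},\n"
    "}\n"
    "  % note to self\n"
    "@book{b2021,\n"
    "  title={B},\n"
    "}\n"
)
cleaned = strip_bibtex_comments(sample).getvalue()
print(cleaned)
```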

cocopack.overleaf.stitch_bibtex_files(project_path, bibtex_files, output_file, cleanup=False, dry_run=True, **kwargs)

Combine multiple BibTeX files into a single file, removing duplicates.

Parameters:
  • project_path (str) – Path to the project root directory.

  • bibtex_files (Union[str, list]) – Either a directory containing BibTeX files or a list of BibTeX file paths.

  • output_file (str) – Path where the stitched file will be saved.

  • cleanup (bool, optional) – If True, delete or back up the original files. Defaults to False.

  • dry_run (bool, optional) – If True, don’t write the stitched file or perform cleanup. Defaults to True.

  • **kwargs – Additional keyword arguments:
      • prepend_project (bool): If True, prepend project_path to output_file. Defaults to True.
      • backup_dir (str): Directory where original files will be backed up, if cleanup is True.
      • verbose (bool): If True, print detailed information. Defaults to False.

Returns:

None
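The source does not say how duplicates are detected. One common approach, sketched here as an assumption, is to keep the first entry seen for each citation key. merge_bibtex and ENTRY_START are hypothetical names, the sketch works on strings, not files, and any text before the first entry of a source is dropped.

```python
import re

# Start of an entry such as "@article{smith2020,"; group 1 is the citation key.
ENTRY_START = re.compile(r"^@\w+\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)


def merge_bibtex(texts):
    """Concatenate BibTeX sources, keeping the first entry seen for each key."""
    seen, merged = set(), []
    for text in texts:
        starts = [match.start() for match in ENTRY_START.finditer(text)]
        bounds = starts + [len(text)]
        for begin, end in zip(bounds, bounds[1:]):
            entry = text[begin:end].strip()
            key = ENTRY_START.match(entry).group(1)
            if key not in seen:
                seen.add(key)
                merged.append(entry)
    return "\n\n".join(merged) + "\n"


first = "@article{smith2020,\n  title={One},\n}\n"
second = "@article{smith2020,\n  title={One},\n}\n\n@book{lee2021,\n  title={Two},\n}\n"
merged_text = merge_bibtex([first, second])
print(merged_text)
```

Deduplicating by key keeps citations resolvable, but it silently discards a later entry that reuses a key with different content, so a real tool might want to warn in that case.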

cocopack.path_ops

cocopack.path_ops.diffpath(path, root)

Get the relative path between two paths.

Parameters:
  • path (str) – The target path.

  • root (str) – The root path to compute the relative path from.

Returns:

The relative path from root to path.

Return type:

str
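The described behavior matches what os.path.relpath provides in the standard library. The sketch below uses that function to show the expected kind of result; relative_to_root is a hypothetical stand-in, and whether diffpath is implemented this way is not confirmed by the source.

```python
import os


def relative_to_root(path, root):
    """Return path expressed relative to root."""
    return os.path.relpath(path, start=root)


project_root = os.path.join("home", "user", "project")
figure_path = os.path.join(project_root, "figures", "fig1.png")

relative = relative_to_root(figure_path, project_root)
print(relative)  # figures/fig1.png on POSIX systems
```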

cocopack.path_ops.print_path_structure(root_dir, max_depth=2, include=None, exclude=None, **kwargs)

Print a hierarchical representation of a directory structure.

Parameters:
  • root_dir (str) – Path to the root directory to display.

  • max_depth (int, optional) – Maximum depth of directories to display. Defaults to 2.

  • include (Union[str, list], optional) – Pattern(s) to include in the output. Only entries containing these patterns will be shown. Defaults to None.

  • exclude (Union[str, list], optional) – Pattern(s) to exclude from the output. Entries containing these patterns will be excluded. Defaults to None.

  • **kwargs – Additional keyword arguments. whitespace (int): Number of spaces to add before each line. Defaults to 0.

cocopack.path_ops.list_packages(pkg_names=[], dir_paths=None, pkg_types=['site-packages'], file_types=['.py'], other_filters=[], **kwargs)

List and display package structures from Python’s import paths.

Parameters:
  • pkg_names (Union[str, list], optional) – Name(s) of packages to list. If empty, all packages in found directories will be listed. Defaults to [].

  • dir_paths (Union[str, list], optional) – Directory paths to search for packages. If None, uses sys.path. Defaults to None.

  • pkg_types (Union[str, list], optional) – Types of package directories to look for. Defaults to [‘site-packages’].

  • file_types (Union[str, list], optional) – File extensions to include in the output. Defaults to [‘.py’].

  • other_filters (Union[str, list], optional) – Additional patterns to filter by. Defaults to [].

  • **kwargs – Additional keyword arguments:
      • global_root (str): Common root path for relative path display.
      • max_depth (int): Maximum depth for print_path_structure. Defaults to 2.

cocopack.pacman

cocopack.pacman.delete_git_files(folder_path, dry_run=True)

Delete all Git-related files and directories in a given folder.

Parameters:
  • folder_path (str) – Path to the folder to clean.

  • dry_run (bool, optional) – If True, only print the files that would be deleted without actually deleting them. Defaults to True.

cocopack.pacman.delete_ipynb_checkpoints(target_dir, dry_run=True)

Delete all Jupyter Notebook checkpoint directories in a given folder.

Parameters:
  • target_dir (str) – Path to the directory to clean.

  • dry_run (bool, optional) – If True, only print the directories that would be deleted without actually deleting them. Defaults to True.
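The dry-run pattern used throughout this module can be sketched as follows: collect the candidates first, then either report them or delete them. remove_checkpoints is a hypothetical helper written with os.walk and shutil.rmtree; it illustrates the pattern and is not the library's code.

```python
import os
import shutil
import tempfile


def remove_checkpoints(target_dir, dry_run=True):
    """Find .ipynb_checkpoints directories; delete them unless dry_run is True."""
    found = []
    for current, dirs, _files in os.walk(target_dir):
        if ".ipynb_checkpoints" in dirs:
            found.append(os.path.join(current, ".ipynb_checkpoints"))
            dirs.remove(".ipynb_checkpoints")  # no need to walk into it
    for path in found:
        if dry_run:
            print(f"[dry run] would delete {path}")
        else:
            shutil.rmtree(path)
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    checkpoint = os.path.join(tmp, "notebooks", ".ipynb_checkpoints")
    os.makedirs(checkpoint)
    listed = remove_checkpoints(tmp)            # dry run: nothing is removed
    still_there = os.path.isdir(checkpoint)
    remove_checkpoints(tmp, dry_run=False)      # now actually delete
    gone = not os.path.exists(checkpoint)
```

Defaulting dry_run to True, as the documented functions do, means a destructive call has to be requested explicitly.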

cocopack.pacman.remove_kernel_metadata(notebook_path)

Remove kernel specification metadata from a Jupyter notebook.

Parameters:

notebook_path (str) – Path to the Jupyter notebook file.
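A notebook file is JSON, and the kernel specification lives under the top-level metadata key as kernelspec. The sketch below removes that entry with the json module. drop_kernelspec is a hypothetical helper, and returning the modified notebook is a convenience for the demo, not documented behavior.

```python
import json
import os
import tempfile


def drop_kernelspec(notebook_path):
    """Rewrite the notebook file without its metadata['kernelspec'] entry."""
    with open(notebook_path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    notebook.get("metadata", {}).pop("kernelspec", None)
    with open(notebook_path, "w", encoding="utf-8") as handle:
        json.dump(notebook, handle, indent=1)
    return notebook


minimal_notebook = {
    "cells": [],
    "metadata": {
        "kernelspec": {"name": "python3", "display_name": "Python 3"},
        "language_info": {"name": "python"},
    },
    "nbformat": 4,
    "nbformat_minor": 5,
}

with tempfile.TemporaryDirectory() as tmp:
    nb_path = os.path.join(tmp, "demo.ipynb")
    with open(nb_path, "w", encoding="utf-8") as handle:
        json.dump(minimal_notebook, handle)
    drop_kernelspec(nb_path)
    with open(nb_path, encoding="utf-8") as handle:
        reloaded = json.load(handle)

print(reloaded["metadata"])  # {'language_info': {'name': 'python'}}
```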

cocopack.pacman.insert_colab_metadata(notebook_path)

Insert Google Colab metadata into a Jupyter notebook.

This function adds metadata that configures the notebook to use GPU acceleration with a T4 GPU type when opened in Google Colab.

Parameters:

notebook_path (str) – Path to the Jupyter notebook file.

cocopack.pacman.clear_ipynb_checkpoints(project_dir, dry_run=True)

Delete all Jupyter Notebook checkpoint directories in a project.

Parameters:
  • project_dir (str) – Path to the project directory to clean.

  • dry_run (bool, optional) – If True, only print the directories that would be deleted without actually deleting them. Defaults to True.

cocopack.pacman.clean_project_notebooks(project_dir, dry_run=True)

Clean Jupyter notebooks in a project by inserting Colab metadata.

Parameters:
  • project_dir (str) – Path to the project directory containing notebooks.

  • dry_run (bool, optional) – If True, only print the notebooks that would be modified without actually modifying them. Defaults to True.

cocopack.pacman.clear_git_files(project_dir, dry_run=True)

Delete all Git-related files and directories in a project.

Parameters:
  • project_dir (str) – Path to the project directory to clean.

  • dry_run (bool, optional) – If True, only print the files that would be deleted without actually deleting them. Defaults to True.

cocopack.pacman.tar_files(source, filename, include=None, exclude=None, hidden=False, fmt='bz2', dry_run=True)

Create a tar archive of files from a source directory or list of files.

Parameters:
  • source (Union[str, list]) – Either a directory path or a list of file paths to include.

  • filename (str) – Base name for the output tar file (without extension).

  • include (list, optional) – List of patterns to include in the archive. If provided, only files matching these patterns will be included. Defaults to None.

  • exclude (list, optional) – List of patterns to exclude from the archive. Files matching these patterns will be excluded. Defaults to None.

  • hidden (bool, optional) – If True, include hidden files (starting with ‘.’). Defaults to False.

  • fmt (str, optional) – Compression format to use (‘bz2’ or ‘gz’). Defaults to ‘bz2’.

  • dry_run (bool, optional) – If True, only return the list of files that would be included without creating the archive. Defaults to True.

Returns:

If dry_run is True, returns the list of files that would be included.

Return type:

list

Raises:

ValueError – If an unsupported format is specified or if source is invalid.
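A simplified sketch of the same idea using the standard tarfile module: list the files first, and write the archive only when dry_run is False. make_archive is a hypothetical helper that handles a directory source only and skips the include, exclude, and hidden filters; the .tar.<fmt> naming of the output file is an assumption.

```python
import os
import tarfile
import tempfile


def make_archive(source_dir, filename, fmt="bz2", dry_run=True):
    """List files under source_dir; write <filename>.tar.<fmt> unless dry_run."""
    if fmt not in ("bz2", "gz"):
        raise ValueError(f"Unsupported format: {fmt!r}")
    files = sorted(
        os.path.join(current, name)
        for current, _dirs, names in os.walk(source_dir)
        for name in names
    )
    if not dry_run:
        with tarfile.open(f"{filename}.tar.{fmt}", f"w:{fmt}") as archive:
            for path in files:
                # Store paths relative to source_dir inside the archive
                archive.add(path, arcname=os.path.relpath(path, source_dir))
    return files


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "data")
    os.makedirs(source)
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(source, name), "w", encoding="utf-8") as handle:
            handle.write(name)
    target = os.path.join(tmp, "backup")
    planned = make_archive(source, target)                # dry run only
    created_early = os.path.exists(target + ".tar.bz2")
    make_archive(source, target, dry_run=False)           # write the archive
    with tarfile.open(target + ".tar.bz2") as archive:
        members = sorted(archive.getnames())

print(members)  # ['a.txt', 'b.txt']
```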

cocopack.pacman.get_file_size(file_path, unit_format='MB')

Get the size of a file in the specified unit format.

Parameters:
  • file_path (str) – Path to the file.

  • unit_format (str, optional) – Unit to return the size in. Options are ‘B’, ‘KB’, ‘MB’, ‘GB’, ‘TB’, ‘PB’, ‘EB’, ‘ZB’, ‘YB’. Defaults to ‘MB’.

Returns:

Size of the file in the specified unit.

Return type:

float

Raises:

ValueError – If an unsupported unit format is specified.
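The unit conversion can be sketched as a division of the byte count by a power of the unit base. The sketch assumes binary units (1 KB = 1024 B); the source does not say whether Coco-Pack uses 1024 or 1000, and convert_size and file_size are hypothetical names.

```python
import os
import tempfile

UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def convert_size(num_bytes, unit_format="MB"):
    """Convert a byte count to the requested unit (assuming 1 KB = 1024 B)."""
    if unit_format not in UNITS:
        raise ValueError(f"Unsupported unit: {unit_format!r}")
    return num_bytes / (1024 ** UNITS.index(unit_format))


def file_size(file_path, unit_format="MB"):
    """Return the size of file_path in the requested unit."""
    return convert_size(os.path.getsize(file_path), unit_format)


with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"\0" * 2048)  # a 2048-byte file for the demo
    temp_path = handle.name
size_kb = file_size(temp_path, "KB")
os.remove(temp_path)

print(size_kb)  # 2.0
```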

cocopack.pacman.get_exclusions(*exclusion_specs, path_set=None, cache=True, exclude_by_size=False, max_file_size='20MB')

Get a list of file patterns to exclude based on specified criteria.

Parameters:
  • *exclusion_specs – Variable number of exclusion specifications. These can be categories like ‘image’, ‘video’, ‘audio’, etc.

  • path_set (Union[str, list], optional) – Either a directory path or a list of file paths to check for exclusions by size. Defaults to None.

  • cache (bool, optional) – If True, include ‘.cache’ in exclusions. Defaults to True.

  • exclude_by_size (bool, optional) – If True, exclude files larger than max_file_size. Defaults to False.

  • max_file_size (str, optional) – Maximum file size as a string with unit (e.g., ‘20MB’). Defaults to ‘20MB’.

Returns:

List of file patterns and paths to exclude.

Return type:

list
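Excluding by size requires turning a string such as ‘20MB’ into a byte limit and comparing file sizes against it. The sketch below shows one way to do that; parse_size and oversized are hypothetical helpers, the binary unit factors are an assumption, and the sizes are supplied as a plain dictionary so the example does not touch the file system.

```python
import re

# Assumed binary unit factors; the source does not specify 1024 versus 1000.
UNIT_FACTORS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(spec):
    """Convert a string such as '20MB' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", spec)
    if match is None or match.group(2).upper() not in UNIT_FACTORS:
        raise ValueError(f"Cannot parse size: {spec!r}")
    return int(float(match.group(1)) * UNIT_FACTORS[match.group(2).upper()])


def oversized(sizes, max_file_size="20MB"):
    """Return the paths in sizes (path -> bytes) that exceed max_file_size."""
    limit = parse_size(max_file_size)
    return [path for path, size in sizes.items() if size > limit]


sizes = {"small.csv": 1024, "video.mp4": 50 * 1024 ** 2}
too_big = oversized(sizes)
print(too_big)  # ['video.mp4']
```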