Core architecture
- Introduced a `core` package with shared orchestration helpers:
  - `core.collector.CollectorModule` wraps existing collectors with metadata and text-report support.
  - `core.registry.CollectorRegistry` performs auto-discovery per target category and ensures the project root is on `sys.path` before imports.
  - `core.runner.CollectorRunner` handles prerequisite checks, banner execution, and optional text report writing.
  - `core.style` centralises ANSI styling; collectors import `style` instead of defining local classes.
  - `core.input_classifier` offers unified target classification (`domain`, `email`, `ip`, `username`) and private-IP detection.
  - `core.config` loads values from `config.ini` (falling back to legacy `config.py`) once and exposes `get_config_value` for collectors.
- `datasploit.py` now uses the classifier + runner and skips private IPs automatically.
- Legacy entry points (`domainOsint.py`, `emailOsint.py`, `ipOsint.py`, `usernameOsint.py`) have been removed in favour of a single `datasploit.py` entrypoint. Older workflows should switch to `python datasploit.py -i <target>`.
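The target classification and private-IP skip described above can be sketched roughly as follows. This is an illustration only, not the actual `core.input_classifier` code: the function names `classify_target` and `is_private_ip` and both regular expressions are assumptions.

```python
import ipaddress
import re

# Hypothetical patterns; the real classifier may apply different rules.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
DOMAIN_RE = re.compile(r"^(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}$")


def is_private_ip(value):
    """Return True if value parses as an IP address in a private range."""
    try:
        return ipaddress.ip_address(value).is_private
    except ValueError:
        return False


def classify_target(value):
    """Map a raw target string to one of: ip, email, domain, username."""
    value = value.strip()
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    if EMAIL_RE.match(value):
        return "email"
    if DOMAIN_RE.match(value):
        return "domain"
    # Anything that is not an IP, email, or domain is treated as a username.
    return "username"
```

An entry point built this way could call `classify_target` first and skip the run when the category is `ip` and `is_private_ip` returns `True`.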
Collector requirements
- Every collector module must define:
  - `ENABLED`: boolean flag to toggle execution.
  - `MODULE_NAME`: user-facing name (derived from filename when omitted).
  - `REQUIRES`: tuple of config keys required before execution (empty tuple if none).
- `banner()` implementations now return a plain string; the runner handles consistent formatting via the shared `style` helper.
- Modules missing `MODULE_NAME` or `REQUIRES` are skipped with an explanatory warning.
- Many collectors now declare prerequisites (GitHub tokens, Shodan API key, etc.) so the runner can skip them gracefully when credentials are absent.
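As a rough illustration of how a runner might use this metadata, the sketch below skips a collector when it is disabled or when a key listed in `REQUIRES` has no value. The helper names `missing_requirements` and `run_collector`, the `get_value` callback, and the message format are assumptions, not the real `core.runner.CollectorRunner` API; the sketch also assumes the module already defines `MODULE_NAME`.

```python
from types import SimpleNamespace


def missing_requirements(module, get_value):
    """Return the keys in module.REQUIRES for which get_value yields nothing."""
    return [key for key in getattr(module, "REQUIRES", ()) if not get_value(key)]


def run_collector(module, target, get_value):
    """Run one collector; return None if it is disabled or lacks credentials."""
    if not getattr(module, "ENABLED", False):
        return None
    missing = missing_requirements(module, get_value)
    if missing:
        print("[!] Skipping %s: missing %s" % (module.MODULE_NAME, ", ".join(missing)))
        return None
    print(module.banner())  # banner() returns a plain string
    return module.main(target)


# Stand-in for a collector module (hypothetical names and key).
demo = SimpleNamespace(
    ENABLED=True,
    MODULE_NAME="Demo",
    REQUIRES=("demo_api_key",),
    banner=lambda: "Demo lookup",
    main=lambda target: {"target": target},
)
```

Passing a lookup such as `{"demo_api_key": "secret"}.get` as `get_value` lets the stand-in run; passing `{}.get` makes the runner skip it.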
Configuration access
- `vault.py` is deprecated; calling `vault.get_key` raises an error instructing authors to use `core.config.get_config_value`.
- All collectors use `get_config_value` instead of rolling their own config parsing.
- The user-facing configuration moved to `config.ini` to align with a familiar INI-style template and avoid name clashes with `core/config.py`.
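A load-once config helper along these lines could look like the sketch below. It is built on the standard-library `configparser` and makes several assumptions: the signature of `get_config_value`, the `path` parameter, and the "search every section" lookup are illustrative, and the legacy `config.py` fallback is omitted.

```python
import configparser
from functools import lru_cache


@lru_cache(maxsize=None)
def _load_config(path):
    """Parse the INI file once per path; later calls reuse the cached parser."""
    # interpolation=None keeps literal '%' characters in API keys intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is ignored and yields an empty parser
    return parser


def get_config_value(key, default=None, path="config.ini"):
    """Return the first non-empty value for key in any section, else default."""
    parser = _load_config(path)
    for section in parser.sections():
        value = parser.get(section, key, fallback=None)
        if value:
            return value
    return default
```

A collector would then read a credential with something like `get_config_value("some_api_key")` and treat `None` as "not configured".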
Package cleanup
- Module packages now auto-import their collectors so library users can access them via attributes like `datasploit.username.username_gitscrape`.
- Removed the `base.py` shim from all module packages (`domain`, `emails`, `ip`, `username`). Discovery now relies on the registry's path injection.
- Package `__init__` files perform lightweight auto-imports instead of remaining empty stubs, preserving backwards-compatible attribute access.
- Documentation (`docs/Writing_Modules.md`) updated to reflect the new structure: no base helper or auto-import glue; collectors live directly in the package with the required metadata.
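One way a package `__init__` can perform such lightweight auto-imports is with `pkgutil.iter_modules` and `importlib.import_module`. The sketch below is an assumption about the approach rather than the project's actual code, and the helper name `auto_import` is hypothetical.

```python
import importlib
import pkgutil


def auto_import(package_name, package_path, namespace):
    """Import each public submodule of a package and bind it in namespace."""
    imported = []
    for info in pkgutil.iter_modules(package_path):
        if info.name.startswith("_"):
            continue  # ignore private helpers
        try:
            module = importlib.import_module(package_name + "." + info.name)
        except ImportError:
            continue  # skip collectors whose optional dependencies are missing
        namespace[info.name] = module
        imported.append(info.name)
    return imported


# Inside a package's __init__.py this would be called as:
#     auto_import(__name__, __path__, globals())
```

After that call, each collector is reachable as an attribute of the package, which is what keeps attribute-style access working for library users.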
Dependency updates
- `requirements.txt` and `Pipfile` list the exact dependencies used at runtime; obsolete Python 2 packages dropped, and version floors set for Python 3 compatibility.
- Optional integrations (e.g., `emails/email_hacked_emails.py`) now handle missing optional dependencies gracefully (`cfscrape`).
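A common pattern for tolerating a missing optional dependency is a guarded import, sketched below for `cfscrape`. How `emails/email_hacked_emails.py` actually handles this is not shown in these notes, so the helper name `make_scraper` and the fallback to `None` are assumptions.

```python
try:
    import cfscrape  # optional third-party dependency
except ImportError:
    cfscrape = None


def make_scraper():
    """Return a cfscrape scraper session, or None when cfscrape is not installed."""
    if cfscrape is None:
        return None
    return cfscrape.create_scraper()
```

A collector using this helper can check for `None` and return an empty result instead of crashing at import time.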
Contributor guidance
- When adding a new collector:
  - Place it under the appropriate package (`domain/`, `emails/`, `ip/`, `username/`).
  - Name the file `category_source.py` (e.g., `domain_newservice.py`).
  - Define `ENABLED`, `MODULE_NAME`, `REQUIRES`, and optionally `WRITE_TEXT_FILE` and `DESCRIPTION`.
  - Implement `banner()`, `main(target)`, `output(data, target)`; `output_text(data)` if you set `WRITE_TEXT_FILE = True`.
  - The shared runner auto-discovers it; no manual registry changes needed.
- `config.ini` should contain any new keys referenced in `REQUIRES`; users populate them with their credentials.
Refer to `core/collector.py` and `docs/Writing_Modules.md` for detailed examples.
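Putting the checklist above together, a new collector might look like the skeleton below. Everything specific here is hypothetical: the service, the `newservice_api_key` config key, the shape of the returned data, and the assumption that `output_text` returns the report text as a string.

```python
"""domain/domain_newservice.py: hypothetical example collector."""

ENABLED = True
MODULE_NAME = "New Service"
REQUIRES = ("newservice_api_key",)  # hypothetical key expected in config.ini
WRITE_TEXT_FILE = True
DESCRIPTION = "Looks up a domain in the (fictional) New Service API."


def banner():
    # Return a plain string; the runner applies the shared styling.
    return "Searching New Service"


def main(target):
    # A real collector would query its data source here.
    return {"domain": target, "records": []}


def output(data, target):
    # Print a short human-readable summary for the console.
    print("%s: %d record(s)" % (target, len(data["records"])))


def output_text(data):
    # Only needed because WRITE_TEXT_FILE is True (return type is assumed).
    return "\n".join(str(record) for record in data["records"])
```

With the file saved under `domain/`, no further registration should be required, since the registry discovers collectors per target category.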