MDFactory

chemistry_utilities

Molecular graph analysis for lipid head/tail detection.

func check_branch_points_not_in_cycles(mol, branch_point_indices, exclude_elements=[])

Filter branch points to those not part of any ring.

param mol: rdkit.Chem.Mol

RDKit molecule object.

param branch_point_indices: list of int

Atom indices representing branch points.

param exclude_elements: list of str = []

Element symbols to keep even if they are in a cycle.

Returns

list of int

Branch point indices that are not in a cycle (or whose element is in exclude_elements).
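The filtering rule can be sketched without RDKit. The block below is a minimal illustration, not the library's implementation: `in_cycle` and `keep_non_ring_branch_points` are hypothetical helpers, and the molecule is represented by an assumed adjacency dictionary plus an element lookup. With a real `rdkit.Chem.Mol`, `atom.IsInRing()` answers the ring question directly.

```python
from collections import deque


def in_cycle(adjacency, atom):
    """Return True if `atom` lies on at least one cycle of the graph."""
    neighbors = list(adjacency[atom])
    # An atom is on a cycle exactly when two of its neighbors are joined
    # by a path that does not pass through the atom itself.
    for i, start in enumerate(neighbors):
        targets = set(neighbors[i + 1:])
        seen = {atom, start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in adjacency[current]:
                if nxt in seen:
                    continue
                if nxt in targets:
                    return True
                seen.add(nxt)
                queue.append(nxt)
    return False


def keep_non_ring_branch_points(adjacency, elements, branch_points,
                                exclude_elements=()):
    """Keep branch points outside rings, plus any whose element is exempt."""
    return [
        idx for idx in branch_points
        if elements[idx] in exclude_elements or not in_cycle(adjacency, idx)
    ]


# Hypothetical example: a three-membered ring (0, 1, 2) with a branched
# chain attached at atom 2. Atoms 2 and 3 both have three neighbors.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3], 5: [3]}
elements = {0: "C", 1: "C", 2: "N", 3: "C", 4: "C", 5: "C"}

outside_rings = keep_non_ring_branch_points(adjacency, elements, [2, 3])
with_exemption = keep_non_ring_branch_points(adjacency, elements, [2, 3], ["N"])
```

Here atom 2 is dropped because it sits in the ring, unless its element is listed in `exclude_elements`.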

func remove_leaves(graph)

Remove leaf nodes (nodes with exactly one connection) from the graph. Because the function returns None, the graph is presumably modified in place.

param graph

Returns

None
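As a rough illustration of leaf trimming, the sketch below works on a plain adjacency dictionary instead of the graph object the function expects. `trim_leaves` is a hypothetical stand-in, and the single-pass behavior is an assumption.

```python
def trim_leaves(adjacency):
    """Remove every degree-1 node from an adjacency dict, in place."""
    # Collect the leaves first so that removing one leaf does not change
    # which nodes count as leaves during this pass.
    leaves = [node for node, nbrs in adjacency.items() if len(nbrs) == 1]
    for leaf in leaves:
        for nbr in adjacency.pop(leaf):
            if nbr in adjacency:
                adjacency[nbr].discard(leaf)


# Hypothetical example: chain 0-1-2-3-4 with a one-atom branch 2-5.
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3}, 5: {2}}
trim_leaves(adjacency)          # removes atoms 0, 4 and 5
once_trimmed = sorted(adjacency)
trim_leaves(adjacency)          # a second pass removes atoms 1 and 3
twice_trimmed = sorted(adjacency)
```

Applying the pass twice mirrors the "twice-trimmed graph" mentioned for `classify_endpoints`.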

func detect_lipid_parts_from_smiles_modified(smiles, head_search_radius=3, min_tail_distance=6)

Detect head group, tail termini, and branch points of a lipid from SMILES.

Parse the molecular graph, trim terminal atoms, identify branch points, and classify endpoints as head group or tail atoms.

param smiles: str

SMILES string of the lipid molecule.

param head_search_radius: int = 3

Maximum graph distance to search for head-group heteroatoms from trimmed endpoints. Default is 3.

param min_tail_distance: int = 6

Minimum graph distance from the head-group branch point required for a valid tail endpoint. Default is 6.

Returns

int or None

Atom index of the detected head group, or None if parsing fails.

func classify_endpoints(mol, graph, endpoints, branch_indices, head_search_radius, min_tail_distance=10)

Classify trimmed-graph endpoints as head group or tail atoms.

Search for the head group by looking for terminal hydroxyl, nearby nitrogen, farthest nitrogen, farthest oxygen, or any heteroatom (in that priority order). Remaining endpoints that pass distance and element filters become tails.

param mol: rdkit.Chem.Mol

RDKit molecule object.

param graph: networkx.Graph

Full molecular connectivity graph.

param endpoints: list of int

Endpoint atom indices from the twice-trimmed graph.

param branch_indices: list of int

Atom indices of non-cyclic branch points.

param head_search_radius: int

Maximum graph distance to search for head-group heteroatoms.

param min_tail_distance: int = 10

Minimum graph distance from the head-group branch point for a valid tail. Default is 10.

Returns

int or None

Atom index of the detected head group.

func find_closest_node_single_source(graph, source_node, candidate_nodes, cutoff=2)

Find the closest candidate node to a source using shortest-path distance.

param graph: networkx.Graph

Molecular connectivity graph.

param source_node: int

Starting node index.

param candidate_nodes: list of int

Node indices to consider as targets.

param cutoff: int = 2

Maximum path length to search. Default is 2.

Returns

int or None

Index of the closest candidate, or None if no candidate is reachable within cutoff.
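A cutoff-limited closest-node search can be sketched with a breadth-first search over a plain adjacency dictionary. `closest_candidate` is a hypothetical stand-in for illustration. Two assumptions are made here: the source node never counts as its own match, and ties are broken by visiting order.

```python
from collections import deque


def closest_candidate(adjacency, source, candidates, cutoff=2):
    """Return the candidate nearest to `source`, or None beyond `cutoff`."""
    candidates = set(candidates)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        # BFS visits nodes in order of increasing distance, so the first
        # candidate encountered is a closest one.
        if node != source and node in candidates:
            return node
        if dist == cutoff:
            continue  # do not expand past the cutoff
        for nbr in adjacency[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None


# Hypothetical example: linear chain 0-1-2-3-4.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

near = closest_candidate(chain, 0, [2, 4])           # atom 2 is 2 bonds away
too_far = closest_candidate(chain, 0, [3, 4])        # nothing within 2 bonds
wider = closest_candidate(chain, 0, [3, 4], cutoff=3)
```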

func map_to_original_terminals(graph, tail_indices)

Map trimmed-graph endpoints back to the nearest original terminal atoms.

param graph: networkx.Graph

Full molecular connectivity graph (before trimming).

param tail_indices: list of int

Atom indices from the trimmed graph to map back.

Returns

list of int

Corresponding terminal atom indices in the original graph.
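One way to picture the mapping is a breadth-first search from each trimmed endpoint to the nearest degree-1 atom of the untrimmed graph. The sketch below assumes that reading of "nearest original terminal". `nearest_terminal` and `map_to_terminals` are hypothetical names, and the graph is a plain adjacency dictionary.

```python
from collections import deque


def nearest_terminal(adjacency, start):
    """Return the closest degree-1 node to `start`, or None if none exists."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if len(adjacency[node]) == 1:
            return node
        for nbr in adjacency[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return None


def map_to_terminals(adjacency, trimmed_endpoints):
    """Map each trimmed endpoint to its nearest terminal in the full graph."""
    return [nearest_terminal(adjacency, idx) for idx in trimmed_endpoints]


# Hypothetical example: chain 0-1-2-3-4-5-6. Trimming leaves twice
# exposes atoms 2 and 4 as endpoints of the reduced graph.
full = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
terminals = map_to_terminals(full, [2, 4])
```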

func remove_duplicates_and_sort(connections)

Deduplicate and sort a list of atom-index pairs.

param connections: list of list of int

Pairs (or sequences) of atom indices, possibly with duplicates or reversed ordering.

Returns

list of list of int

Unique connections, each internally sorted.
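The deduplication can be sketched in a few lines. `dedupe_and_sort` is a hypothetical stand-in, and ordering the returned list as a whole is an assumption based on the function name.

```python
def dedupe_and_sort(connections):
    """Sort each connection internally, drop duplicates, and sort the result."""
    # Sorting inside each connection makes [3, 1] and [1, 3] identical,
    # so a set can remove both exact and reversed duplicates.
    unique = {tuple(sorted(conn)) for conn in connections}
    return [list(conn) for conn in sorted(unique)]


pairs = [[3, 1], [1, 3], [2, 0], [0, 2], [4, 5]]
cleaned = dedupe_and_sort(pairs)
```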

func analyze_molecular_graph(mol, headgroup_index, tail_indices, branch_indices)

Partition a molecular graph into segments between key structural points.

Trace paths between head group, tail termini, and branch points via BFS, then assign every remaining atom to the nearest segment.

param mol: rdkit.Chem.Mol

RDKit molecule object.

param headgroup_index: int

Atom index of the head group.

param tail_indices: list of int

Atom indices of tail termini.

param branch_indices: list of int

Atom indices of branch points.

Returns

list of list of int

Segments of atom indices, each sorted, covering the full molecule.

func find_connections_from_point(adjacency_list, start, points_of_interest)

Find all BFS paths from a start atom to other points of interest.

param adjacency_list: list of list of int

Per-atom neighbor lists for the molecule.

param start: int

Starting atom index (must be in points_of_interest).

param points_of_interest: set of int

Atom indices at which to terminate paths.

Returns

list of list of int

Each entry is a path (list of atom indices) from start to another point of interest.
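The path search can be sketched as a breadth-first traversal that stops extending a path as soon as it reaches another point of interest. `paths_to_points` is a hypothetical stand-in, and marking each atom as visited only once is an assumption that holds for the acyclic example shown.

```python
from collections import deque


def paths_to_points(adjacency_list, start, points_of_interest):
    """Collect BFS paths from `start` that end at another point of interest."""
    paths = []
    visited = {start}
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        for nbr in adjacency_list[path[-1]]:
            if nbr in visited:
                continue
            visited.add(nbr)
            if nbr in points_of_interest:
                paths.append(path + [nbr])   # terminate the path here
            else:
                queue.append(path + [nbr])   # keep walking
    return paths


# Hypothetical example: chain 0-1-2-3-4 with a branch 2-5-6.
# Atom 0 plays the head, 2 the branch point, 4 and 6 the tail termini.
neighbors = [[1], [0, 2], [1, 3, 5], [2, 4], [3], [2, 6], [5]]
points = {0, 2, 4, 6}

from_branch = paths_to_points(neighbors, 2, points)
from_head = paths_to_points(neighbors, 0, points)
```

From the head atom, only one path is returned because the walk stops at the branch point instead of continuing into the tails.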

func assign_adjacent_atoms(mol, segments, points_of_interest)

Assign unassigned atoms to the nearest existing segment.

Atoms not already in any segment and not in points_of_interest are appended to whichever segment contains their closest neighbor by shortest-path distance.

param mol: rdkit.Chem.Mol

RDKit molecule object.

param segments: list of list of int

Existing segments of atom indices (modified in place).

param points_of_interest: set of int

Head, tail, and branch atom indices (excluded from assignment).

Returns

list of list of int

The input segments with unassigned atoms appended.
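The assignment step can be sketched as one breadth-first search per unassigned atom, stopping at the first atom that already belongs to a segment. `assign_unassigned` is a hypothetical stand-in that takes per-atom neighbor lists instead of an RDKit molecule. Giving an atom shared by several segments to the first segment that lists it is an assumption.

```python
from collections import deque


def assign_unassigned(adjacency_list, segments, points_of_interest):
    """Append each unassigned atom to the segment of its nearest member."""
    # Snapshot of segment membership before any atom is appended.
    owner = {}
    for seg_id, segment in enumerate(segments):
        for atom in segment:
            owner.setdefault(atom, seg_id)

    for atom in range(len(adjacency_list)):
        if atom in owner or atom in points_of_interest:
            continue
        seen = {atom}
        queue = deque([atom])
        while queue:
            current = queue.popleft()
            if current in owner:
                segments[owner[current]].append(atom)
                break
            for nbr in adjacency_list[current]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
    return segments


# Hypothetical example: chain 0-1-2-3-4 with a branch 2-5-6, plus side
# atoms 7 (bonded to 1) and 8 (bonded to 3) that no path passes through.
neighbors = [[1], [0, 2, 7], [1, 3, 5], [2, 4, 8], [3], [2, 6], [5], [1], [3]]
segments = [[0, 1, 2], [2, 3, 4], [2, 5, 6]]
result = assign_unassigned(neighbors, segments, {0, 2, 4, 6})
```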

func visualize_lipid_parts_from_smiles(smiles, output_file='lipid_parts.png')

Visualize detected lipid head and tail groups from SMILES.

Render a 2D image of the molecule with head (red), tail (blue), and branch (green) atoms highlighted. Intended for use in a Jupyter notebook.

param smiles: str

SMILES string of the lipid molecule.

param output_file: str = 'lipid_parts.png'

Path for the output PNG image. Default is "lipid_parts.png".

Returns

tuple of (rdkit.Chem.Mol, int, list of int, list of int) or None

(mol, head_index, tail_indices, branch_indices) on success, or None if the molecule cannot be parsed.

func create_lipid_assignment(mol, head_index, tail_indices, branch_indices, output_file='lipid_assignment.png')

Create a color-coded visualization of lipid segment assignments.

Partition the molecule into segments between head, tail, and branch points, then render each segment in a distinct color.

param mol: rdkit.Chem.Mol

RDKit molecule object.

param head_index: int

Atom index of the head group.

param tail_indices: list of int

Atom indices of tail termini.

param branch_indices: list of int

Atom indices of branch points.

param output_file: str = 'lipid_assignment.png'

Path for the output PNG image. Default is "lipid_assignment.png".

Returns

list of list of int

Segments of atom indices used in the visualization.