About
Motivation
Selecting diverse and representative subsets is crucial for data-driven models and machine learning applications in many science and engineering disciplines, especially molecular design and drug discovery. Motivated by this, we developed the Selector package, a free and open-source Python library for selecting diverse subsets.
The Selector library implements a range of existing algorithms for subset sampling based on the distance between samples and on their similarity, as well as tools based on spatial partitioning. In addition, it includes seven diversity measures for quantifying the diversity of a given set. We also implemented various mathematical formulations to convert similarities into dissimilarities.
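To make the idea of distance-based selection concrete, here is a minimal, generic sketch in plain Python. It is not the Selector API: the function names (`similarity_to_dissimilarity`, `maxmin_select`, `euclidean_matrix`) and the toy points are hypothetical, the conversion d = 1 - s is just one common choice among many, and greedy MaxMin is used only as a representative distance-based strategy.

```python
import math


def similarity_to_dissimilarity(sim):
    """Convert similarities in [0, 1] to dissimilarities with d = 1 - s.

    This is one common convention, assumed here for illustration only.
    """
    return [[1.0 - s for s in row] for row in sim]


def euclidean_matrix(points):
    """Build the full pairwise Euclidean distance matrix for a list of points."""
    return [[math.dist(p, q) for q in points] for p in points]


def maxmin_select(dist, size, start=0):
    """Greedy MaxMin selection on a square distance matrix.

    Starting from `start`, repeatedly add the point whose distance to its
    nearest already-selected point is largest.
    """
    n = len(dist)
    if not 0 < size <= n:
        raise ValueError("size must be between 1 and the number of points")
    selected = [start]
    # min_dist[i] holds the distance from point i to its nearest selected point.
    min_dist = list(dist[start])
    while len(selected) < size:
        candidates = [i for i in range(n) if i not in selected]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [min(a, b) for a, b in zip(min_dist, dist[nxt])]
    return selected


# Toy data (hypothetical): two tight pairs of points plus one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 5.0)]
dist = euclidean_matrix(points)
subset = maxmin_select(dist, size=3)
print(subset)
```

In this toy example the selection picks one point from each well-separated region rather than two near-duplicates, which is the behavior a diversity-oriented selector aims for. For real work, use the algorithms and diversity measures provided by the Selector library itself.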
Selector Library
Selector is a free, open-source, and cross-platform Python library designed to help you effortlessly identify the most diverse subset of molecules from your dataset.
Citation
Please use the following citation in any publication using the Selector library:
To be added
More Information
For more information about the Selector library, please visit our GitHub repository and documentation at https://selector.qcdevs.org.
Acknowledgments
This webserver is supported by the DRI EDIA Champions Pilot Program. We are grateful to the Digital Research Alliance for providing the computing resources.
Contact
The Selector source code is hosted on GitHub and is released under the GNU General Public License v3.0.
We welcome any contributions to the Selector library in accordance with our Code of Conduct; please see our Contributing Guidelines.
Please report any issues you encounter while using the Selector library on GitHub Issues.
For further information and inquiries, please contact us at qcdevs@gmail.com.