




版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、Symmetry-adapted generation of 3d point sets for the targeted discovery of molecules Niklas W. A. Gebauer Machine Learning Group Technische Universitt Berlin 10587 Berlin, Germany n.wa.gebauer Michael Gastegger Machine Learning Group Technische Universitt Berlin 10587 Berlin, Germany michael.gastegg
2、ertu-berlin.de Kristof T. Schtt Machine Learning Group Technische Universitt Berlin 10587 Berlin, Germany kristof.schuetttu-berlin.de Abstract Deep learning has proven to yield fast and accurate predictions of quantum- chemical properties to accelerate the discovery of novel molecules and materi- al
3、s. As an exhaustive exploration of the vast chemical space is still infeasible, we require generative models that guide our search towards systems with desired properties. While graph-based models have previously been proposed, they are restricted by a lack of spatial information such that they are
4、unable to recognize spa- tial isomerism and non-bonded interactions. Here, we introduce a generative neural network for 3d point sets that respects the rotational invariance of the targeted structures. We apply it to the generation of molecules and demonstrate its ability to approximate the distribu
5、tion of equilibrium structures using spatial metrics as well as established measures from chemoinformatics. As our model is able to capture the complex relationship between 3d geometry and electronic properties, we bias the distribution of the generator towards molecules with a small HOMO-LUMO gap a
6、n important property for the design of organic solar cells. 1Introduction In recent years, machine learning enabled a signifi cant acceleration of molecular dynamics simula- tions 17 as well as the prediction of chemical properties across chemical compound space 813. Still, the discovery of novel mo
7、lecules and materials with desired properties remains a major chal- lenge in chemistry and materials science. Accurate geometry-based models require the atom types and positions of candidate molecules in equilibrium, i.e. local minima of the potential energy sur- face. These have to be obtained by g
8、eometry optimization, either using computationally expensive quantum chemistry calculations or a neural network trained on both compositional (chemical) and confi gurational (structural) degrees of freedom 1316. On the other hand, bond-based methods that predict chemical properties using only the mo
9、lecular graph 1720 do not reach the accuracy of geometry-based methods 11,21. In particular, they lack crucial information to distinguish spatial isomers, i.e. molecules that exhibit the same bonding graph, but different 3d structures and chemical properties. As an exhaustive search of chemical spac
10、e is infeasible, several methods have been proposed to generate molecular graphs 22,23 and to bias the distribution of the generative model 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada. towards desired chemical properties for a guided search 24,25. Natur
11、ally, these models suffer from the same drawbacks as bond-based predictions of chemical properties. In this work, we propose G-SchNet (Generative SchNet) for the generation of rotational invariant 3d point sets. Points are placed sequentially according to a learned conditional probability distributi
12、on that refl ects the symmetries of the previously placed points by design. Therefore, G-SchNet conserves rotational invariance as well as local symmetries in the predicted probabilities. We build upon the SchNet architecture 14 using continuous-fi lter convolutional layers to model the interactions
13、 of already placed points. This allows local structures such as functional groups as well as non-bonded interactions to be captured. Note, that we explicitly do not aim to generate molecular graphs, but directly the atomic types and positions, which include the full information necessary to solve th
14、e electronic problem in the Born-Oppenheimer approximation. This allows us to avoid abstract concepts such as bonds and rings, that are not rooted in quantum mechanics and rely heavily on heuristics. Here, we use these constructs only for evaluation and comparison to graph-based approaches. In this
15、manner, our approach enables further analysis using atomistic simulations after generation. In particular, this work provides the following main contributions: We propose the autoregressive neural network G-SchNet for the generation of 3d point sets incorporating the constraints of Euclidean space a
16、nd rotational invariance of the atom distribution as prior knowledge1. We apply the introduced method to the generation of organic molecules with arbitrary composition based on the QM9 dataset 2628. We show that our model yields novel and accurate equilibrium molecules while capturing spatial and st
17、ructural properties of the training data distribution. We demonstrate that the generator can be biased towards complex electronic properties such as the HOMO-LUMO gap an important property for the design of organic solar cells. We introduce datasets containing novel, generated molecules not present
18、in the QM9 dataset. These molecules are verifi ed and optimized at the same level of theory used for QM92. 2Related work Existinggenerativemodelsfor3dstructuresareusuallyconcernedwithvolumetricobjectsrepresented as point cloud data 29, 30. Here, the density of points characterizes shapes of the targ
19、et structures. This involves a large amount of points where the exact placement of individual points is less important. Furthermore, these architectures are not designed to capture rotational invariances of the modeled objects. However, the accurate relative placement of points is the main requireme
20、nt for the generation of molecular structures. Other generative models for molecules resort to bond-based representations instead of spatial struc- tures. They use text-processing architectures on SMILES 31 strings (e.g. 19,3240) or graph deep neural networks on molecular graphs (e.g. 2225,41,42). A
21、lthough these models allow to generate novel molecular graphs, ignoring the 3d positions of atoms during generation discards valuable information connected to possible target properties. Mansimov et al.43have proposed an approach to sample 3d conformers given a specifi c molecular graph. This could
22、be combined with aforementioned graph-based generative models in order to generate 3d structures in a two-step procedure. For the targeted discovery of novel structures, both parts would have to be biased appro- priately. Again, this can be problematic if the target properties crucially depend on th
23、e actual spatial confi guration of molecules, as this information is lacking in the intermediate graph representation. G-SchNet, in contrast, directly generates 3d structures without relying on any graph- or bond-based information. It uses a spatial representation and can be trained end-to-end, whic
24、h allows biasing towards complex, 3d structure-dependent electronic properties. Previously, we have proposed a similar factorization to generate 3d isomers given a fi xed composition of atoms 44. The G-SchNet architecture improves upon this idea as it can deal with arbitrary compositions of atoms an
25、d introduces auxiliary tokens which drastically increase the scalability of the model and the robustness of the approximated distribution. 1The code is publicly available at 2The datasets are publicly available at . 2 3Symmetry-adapted factorization of point set distributions
26、We aim to factorize a distribution over point sets that enables autoregressive generation where new points are sampled step by step. Here, the conditional distributions for each step should be symmetry- adapted to the previously sampled points. This includes invariance to permutation, equivariance t
27、o rotation and translation3as well as the ability to capture local symmetries, e.g. local rotations of functional groups. A 3d point set consists of a variable numbernof positionsRnand typesZn. Each position ri R3is associated with a typeZi N. When generating molecules, these types correspond to che
28、mical elements. Beyond those, we use auxiliary tokens, i.e. additional positions and types, which are not part of a generated point set but support the generation process. A partially generated point set includingtof such auxiliary tokens is denoted asRt i = (r1,.,rt,rt+1,.,rt+i)and Zt i = (Z1,.,Zt,
29、Zt+1,.,Zt+i) , where the fi rsttindices correspond to tokens and all following positions and types indicate actual points. We defi ne the distribution over a collection of point sets as the joint distribution of positions and types,p(Rn,Zn). Instead of learning the usually intractable joint distribu
30、tion directly, we factorize it using conditional probabilities p(Rn,Zn) = n Y i=1 p ?r t+i,Zt+i|Rti1,Zti1 ? # p(stop|Rt n,Z t n) (1) thatpermitasequentialgenerationprocessanalogtoautoregressivegraphgenerativenetworks22,24 or PixelCNN 45 . Note that the distribution over the fi rst point in the set,p
31、(rt+1,Zt+1|Rt 0,Z t 0), is only conditioned on positions and types of the auxiliary tokens and that, depending on the tokens used, its positionrt+1might be determined arbitrarily due to translational invariance. In order to allow for differently sized point sets,p(stop|Rt n,Z t n)determines the prob
32、ability of stopping the generation process. While the number and kind of tokens can be adapted to the problem at hand, there is one auxiliary token that is universally applicable and vital to the scalability of our approach: the focus point ci= (r1,Z1) . It has its own artifi cial type and its posit
33、ion can be selected randomly at each generation step from already placed points (r1 rt+1,.,rt+i1). We then require that the new point is in the neighborhood ofci which may be defi ned by a fi xed radial cutoff. This effectively limits the potential space of new positions to a region of fi xed size a
34、roundcithat does not grow with the number of preceding points. Beyond that, we rewrite the joint distribution of type and position such that the probability of the next position rt+idepends on its associated type Zt+i, p ?r t+i,Zt+i|Rti1,Zti1 ? = p(rt+i|Zt+i,Rt i1,Z t i1)p(Zt+i|R t i1,Z t i1) (2) wh
35、ich allows us to sample the type fi rst and the position afterwards at each generation step. Note that, due to the auxiliary focus token, the sampling of types can already consider some positional information, namely that the next point will be in the neighborhood of the focus point. While the discr
36、ete distribution of types is rotationally invariant, the continuous distribution over positions transforms with already placed points. Instead of learning the 3d distribution of absolute positions directly, we construct it from pairwise distances to preceding points (including auxiliary tokens) whic
37、h guarantees the equivariance properties: p(rt+i|Rt i1,Z t i) = 1 t+i1 Y j=1 p(d(t+i)j|Rt i1,Z t i). (3) Hereis the normalization constant andd(t+i)j= |rt+i rj|2is the distance between the new position and a preceding point or token. As the probability ofrt+iis given in these symmetry-adapted coordi
38、nates, the predicted probability is independent of translation and rotation. Transformed to Cartesian coordinates, this leads to rotationally equivariant probabilities of positions in 3d space according to Eq. 3. Since we employ a focus point, the space to be considered at each step remains small re
39、gardless of the size of the generated point set. 3 positions embedding 128 interaction 128 interaction 128 atom types origin current focus unfi nished molecule embedding 128 all types + stop token dense 6x1 softmax 6 interaction 128 dense 6x128 sample new type embedding 128 dense 128 dense 300 softm
40、ax 300 calculate grid: sample new position 2. Determine next type1. Atom-wise features3. Distance probabilities4. Sample position Interaction blocks equivalent to SchNet 14 using continuous-fi lter convolutions: Final molecule repeat from 1. until stop has been predicted for each atom Figure 1: Sche
41、me of G-SchNet architecture with an exemplary generation step where an atom is sampled given two already placed atoms and two auxiliary tokens: the focus and the origin. The complete molecule obtained by repeating the depicted procedure is shown in the lower right. Network layers are presented as bl
42、ocks where the shape of the output is given. The embedding layers share weights and the?-operator represents element-wise multiplication of feature vectors. All molecules shown in fi gures have been rendered with the 3d visualization package Mayavi 46. 4G-SchNet 4.1Neural network architecture In ord
43、er to apply the proposed generative framework to molecules, we adopt the SchNet architecture originally developed to predict quantum-chemical properties of atomistic systems 5,14. It allows to extract atom-wise features that are invariant to rotation and translation as well as the order of atoms. Th
44、e main building blocks are an embedding layer, mapping atom types to initial feature vectors, and interaction blocks that update the features according to the atomic environments using continuous-fi lter convolutions. We use the atom representation as a starting point to predict the necessary probab
45、ility distributions outlined above. Fig. 1 gives an overview of the proposed G- SchNet architecture and illustrates one step of the previously described generation process for a molecule. The fi rst column shows how SchNet extracts atom-wise features from previously placed atoms and the two auxiliar
46、y tokens described below in section 4.2. For our experiments, we use nine interaction blocks and 128 atom features. In order to predict the probability of the next type (Eq. 2), we reuse the embedding layer of the feature extraction to embed the chemical elements as well as the stop token, which is
47、treated as a separate atom type. The feature vectors are copied once for each possible next type and multiplied entry-wise with the corresponding embedding before being processed by fi ve atom-wise dense layers with shifted softplus non-linearity4 and a softmax. We obtain the fi nal distribution as
48、follows: p(Zt+i|Rti1,Zt i1) = 1 t+i1 Y j=1 p(Zt+i|xj).(4) The third column of Fig. 1 shows how the distance predictions are obtained in a similar fashion where the atom features of the sampled type from column 2 are reused. A fully-connected network yields a distribution over distances for every inp
49、ut atom, which is discretized on a 1d grid. For 3If we consider a point set invariant to rotation and translation, the conditional distributions have to be equivariant to fulfi ll this property. 4as used in SchNet 14: ssp(x) = ln(0.5ex + 0.5) 4 + focus token+ origin token Figure 2: Effect of the int
50、roduced tokens on the distribution over positions of a new carbon atom. our experiments, we covered distances up to15 with a bin size of approximately0.05. The fi nal probability distribution of positions is obtained as described in Eq. 3. We use an additional temperature parameter that allows contr
51、olling the randomness during generation. The exact equation and more details on all layers of the architecture are provided in the supplement. 4.2Auxiliary tokens We make use of two auxiliary tokens with individual, artifi cial types that are not part of the generated structure. They are fed into th
52、e network and infl uence the autoregressive predictions analog to regular atoms of a molecule. This is illustrated in Fig. 2, where we show the distribution over positions of a new carbon atom given two already placed carbon atoms with and without the placement of additional tokens. Suppose the most
53、 probable position is next to either one of the already placed atoms, but not between them. Then, without tokens, both carbon atoms predict the same distance distribution with two peaks: at a distance of one bond length and at the distance of two bond lengths to cover both likely positions of the ne
54、xt atom. This leads to high placement probabilities between the atoms as an artifact of the molecular symmetry. This issue can be resolved by the earlier introduced focus pointci. Beyond localizing the prediction for scalability, the focus token breaks the symmetry as the new atom is supposed to be
55、placed in its neighborhood. Thus, the generation process is guided to predict only one of the two options and the grid distribution is free of phantom positions (Fig. 2, middle). Consequently, the focus token is a vital part of G-SchNet. A second auxiliary token marks the origin around which the mol
56、ecule grows. The origin token encodes global geometry information, as it is located at the center of mass of a molecule during training. In contrast to the focus token, which is placed on the currently focused atom for each placement step, the position of the origin token remains fi xed during gener
57、ation. The illustration on the right-hand side of Fig. 2 shows how the distribution over positions is further narrowed by the origin token. We have carried out ablation studies where we remove the origin token from our architecture. These experiments are detailed in the supplement and have shown tha
58、t the origin token signifi cantly improves the approximation of the training data distribution. It increases the amount of validly generated structures as well as their faithfulness to typical characteristics of the training data. 4.3Training For training, we split each molecule in the training set
59、into a trajectory of single atom placement steps. We sample a random atom placement trajectory in each training epoch. This procedure is described in detail in the supplement. The next type at each stepican then be one-hot encoded by qtype i . To obtain the distance labelsqdist ij , the distancesd(t+i)jbetween the preceding points (atoms and tokens) and the atom to be placed are expanded using Gaussians located at the respecti
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 2025年中国南水北调集团中线有限公司夏季招聘4人笔试历年参考题库附带答案详解
- 仙桃市事业单位2025年统一公开招聘笔试历年典型考题及考点剖析附带答案详解
- 海南省万宁市2025年事业单位公开招聘工作人员(第5号)笔试历年典型考题及考点剖析附带答案详解
- 毕节市事业单位2025年面向社会公开招聘笔试联考笔试历年典型考题及考点剖析附带答案详解
- 2025年铜陵市铜官区事业单位公开招聘笔试原始笔试历年典型考题及考点剖析附带答案详解
- 2025年新公共营养师三级真题及答案解析大全
- 小学生秋冬保健知识课件
- 2025年能源行业智能电网在数字化转型中的智能运维与故障预测报告
- 信息工程招标管理办法
- 人才工作标识管理办法
- 劳务派遣与服务协议
- 消费者权益保护培训课件
- DB11T 2454-2025 职业健康检查质量控制规范 生物样本化学物质检测
- 贸易公司员工职业操守行为准则制度
- 电气安全基础知识安全培训
- 部门保密培训课件
- 福建省南平市2024-2025学年八年级下学期期末考试数学试卷(含答案)
- 电网工程设备材料信息参考价2025年第一季度
- SLAP损伤的治疗课件
- 以理解为中心的历史教育 西安张汉林 全国历史教育专家2016年夏高考研讨会最新材料
- 拆除锅炉施工方案
评论
0/150
提交评论