#############################################################################
Data Repository and Archive for:
#############################################################################

Title:   Tuning conformational asymmetry in particle-forming diblock
         copolymer alloys.
Authors: Logan J. Case
         Frank S. Bates
         Kevin D. Dorfman
Journal: Soft Matter
Year:    2023
DOI:     10.1039/d2sm01332k

#############################################################################
Software
#############################################################################

The C++ version of the PSCF software was used for calculations in this work.
Github: https://github.com/dmorse/pscfpp.git
The exact source code version used on this project is also included in the
repository in directory root/pscfpp_code/

Analysis of data was done via Jupyter Notebooks hosted on MSI resources.

#############################################################################
Data Files and Naming Conventions
#############################################################################

To reduce the size of the repository, many of the raw data files (such as
fields and data files output by PSCFpp) have been omitted from this
repository. Instead, the repository contains all files necessary to rerun
the calculations and regenerate the raw data files.

Throughout this repository, anywhere that raw data was contained (and where
it may be regenerated) you will find the following files:

param files
    Parameter files read by PSCF. These may have different prefixes (such as
    `up_param` or `down_param`) but the filename will always end with
    `param`.

command files
    Command sequences executed by PSCF for the calculation. These will
    always be named `command`.

initial potential fields
    These are chemical potential fields in the symmetry-adapted basis
    format. These are the initial fields used for the calculation.
    They will be named either `in.omega` or `in.bf`.

jobscript
    These are shell scripts used to launch a batch job via SLURM on the MSI
    supercomputer system. These will show you the command used when running
    the calculation.

Files related to data parsing and data analysis are all included. A brief
overview of relevant files:

sweepData.csv
    After running calculations, the raw data files output by PSCFpp were
    parsed using a python script (dataCollection.py) and the relevant data
    were placed in a CSV file. These files appear one directory above where
    the calculation was actually completed. Wherever this file is seen, the
    calculations reflected in the file were run in a sub-directory named
    for each phase.

    For example, within this repository at [root]/chi/chi30_0/ you will find
    a sweepData.csv file, in addition to directories named for A15, C14, and
    other phases. The data in [root]/chi/chi30_0/sweepData.csv represents
    the data at each state point converged during the sweeps in the
    [root]/chi/chi30_0/[phase]/ sub-directories.

    These .csv files contain essentially all relevant phase data. Column
    labels for the CSV are listed below, with a description of the data
    contained in that column.
    ---------------
    phase :
        Name of the phase, as used for the directory (e.g. A15, bccAB,
        althex_i)
    prefix :
        The data in this row was obtained from file [phase]/[prefix].dat
        relative to the directory containing the sweepData.csv file.
    fHelm :
        Helmholtz free energy returned by PSCFpp
    kuhn0, kuhn1, kuhn2 :
        Statistical segment length of monomer A (kuhn0), B (kuhn1) and
        C (kuhn2).
    chiAB, chiAC, chiBC :
        Flory-Huggins parameter between monomers AB, AC, and BC,
        respectively.
    Na0, Nb0 :
        The length of the A (Na0) and B (Nb0) blocks in the AB chain.
    Nc1, Nb1 :
        The length of the C (Nc1) and B` (Nb1) blocks in the B`C chain.
    phi0, phi1 :
        The blend fraction of AB (phi0) and B`C (phi1) chains.
        If calculation is Canonical ensemble, this is specified in the
        param file.
        If Grand Canonical, this is returned by PSCFpp based on the
        converged field.
    mu0, mu1, dMu :
        The chemical potential of the AB (mu0) and B`C (mu1) chains, as well
        as the difference between them ( dMu = mu1 - mu0 ).
        If calculation is Canonical ensemble, this is the value returned by
        PSCFpp, which has no useful interpretation since the pressure is
        unspecified. dMu values are used occasionally as an estimate when
        starting Grand Canonical calculations.
    system, cellparam0, cellparam1 :
        The crystal system (system) of the unit cell (cubic, hexagonal, etc.)
        and unit cell parameters (cellparam0, cellparam1) from the converged
        solution. Unit cell parameters are as output by PSCFpp. If only one
        unit cell parameter is required (such as in cubic unit cells) the
        last column (cellparam1) will be blank for the row.

*.ipynb
    All data analysis was done in Jupyter Notebooks, hosted on the MSI
    clusters. All of these Notebooks are included in the repository and
    contain analysis code as well as many results.

*.pickle
    After computing the common tangents for each canonical dataset, the
    python objects used to organize that data (and to store the common
    tangent) were written out to a file via pickle. Loading objects from
    these pickled files meant that there was no need to rerun the common
    tangents. These files are all generated by, and should only be read
    from, the Jupyter notebooks used for analysis.

contained_data_files.txt
    This is a text file generated by the dataCollection.py python script.
    It contains a list of the (absolute) paths to all `sweepData.csv` files.
    The paths are absolute as stored on MSI, thus they will not be strictly
    accurate on any other system. They can be used as a reference to see
    where calculations were run relative to the alloys root directory.

In general, scripts used to set up calculations, launch SLURM jobs, and
organize data will not be of much interest.
Many of these setup scripts and template files have been moved from their
original locations into a sub-directory called 'setupScripts/'. When a
directory of this name is seen, the scripts it contains should be run from
one directory up (where you see `setupScripts/`). These scripts are not very
cleanly written, nor very robust. Generally, directories or files with names
such as `startData`, `setup`, `init` or other such phrases will relate to
this.

#############################################################################
Repository Structure
#############################################################################

To reduce the size of the repository, many of the raw data files have been
excluded. Instead, the repository contains all files necessary to rerun the
calculations and regenerate the raw data files. Files generated from analysis
of data, or data collections (such as csv files) are still included. These
contain all data relevant for the analysis.

=================================
root/
=================================
The root directory contains several files. A few of interest:

`README`
    This file

`phase_diagram_analysis.ipynb`
    Jupyter Notebook in which analysis was done to produce the phase
    diagrams.

`*.json`
    JSON files used to store the many data series that make up a phase
    diagram. Several of these (the `phaseDiagSeries*.json` files) are
    incremental backups of each other.
    The most recent version of data for the phase diagram in Fig. 4 is
    found in `phaseDiagSeries_new.json`
    Data used for Fig. 5 are found in `ac15_phaseDiagSeries.json`.

`Tangent_Figures.pdf`
    A PDF document containing all common tangent figures used during
    analysis of Canonical ensemble data. The figures are organized according
    to conditions at which data was collected, and the figure in which the
    data was used.

The root also contains 4 sub-directories that are of interest regarding
presented results.
(3 others are merely for setup)

=================================
root/kuhn/
=================================
This directory contains the results from analyzing the impact of chain
length and conformational asymmetry (Fig 2 and S2).

Data in this directory is organized in a tiered manner.

Directories root/kuhn/chi25/ and root/kuhn/chi28/ contain the data at
chiN=25 and chiN=28, respectively.

Within each of these directories are sub-directories a/, c/, and ac/ which
identify the statistical segment length being varied to produce
conformational asymmetries. Thus, data in a/ varies epsilon_AB, c/ varies
epsilon_BC, and ac/ varies both epsilon values.

Each directory chi[25,28]/[a,c,ac]/ contains an init/ directory used to
sweep upward in statistical segment length to introduce conformational
asymmetry. They also contain sub-directories kuhn100/, kuhn125/, kuhn150/,
which contain data at the referenced conformational asymmetry for the
appropriate chain; kuhn100 is conformationally symmetric, kuhn125 and
kuhn150 are conformational asymmetries of 1.25 and 1.5, respectively.

Finally, each kuhn[100,125,150]/ directory contains an init directory used
to sweep the chain lengths to acquire starting fields for each sweep in phi.
They also contain directories for each chain length asymmetry, named
according to the length of the B`C chain (with AB chain fixed at N_AB = 1)
as `nbc_#_#` where the `#` values symbolize the digits before and after a
decimal point. Thus, nbc_0_5 contains data for N_BC/N_AB=0.5, while nbc_1_2
contains data for N_BC/N_AB=1.2.

The calculations were actually performed in the phase-named sub-directories
below this, with the `sweepData.csv` files being found in the nbc_#_#
directories (see entry above on the .csv files).

=================================
root/chi/
=================================
Canonical ensemble calculations for the primary phase diagram (Fig. 4) for a
system with N_BC/N_AB = 1.0 and epsilon_BC = 1.5.
This directory contains its own README which can be referenced.

=================================
root/grandcanonical/
=================================
Grand canonical ensemble calculations for the primary phase diagram.
This directory contains its own README file describing its contents.
Unlike other directory trees in this repository, the grandcanonical/
directory contains all of its raw data files.

=================================
root/ac_conf_asym/canonical
=================================
Canonical ensemble calculations for the case of both chains having
conformational asymmetry (Figure 5).
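
As an illustration of the `sweepData.csv` columns and the `nbc_#_#` naming
convention described in this README, the following Python sketch decodes a
directory name into N_BC/N_AB and picks, at each phi1, the phase with the
lowest fHelm. It is not taken from dataCollection.py or the analysis
notebooks, and it is not the common tangent analysis used for the paper.
The function names, the prefix values, and the sample rows are made up for
illustration only; real files contain more columns than shown here.

```python
import csv
import io
import re


def decode_nbc_dirname(name):
    """Convert a name such as 'nbc_1_2' into the ratio N_BC/N_AB (1.2)."""
    match = re.fullmatch(r"nbc_(\d+)_(\d+)", name)
    if match is None:
        raise ValueError(f"not an nbc_#_# directory name: {name!r}")
    # The two groups are the digits before and after the decimal point.
    return float(f"{match.group(1)}.{match.group(2)}")


def lowest_fhelm_by_phi1(csv_text):
    """Return {phi1: (phase, fHelm)} for the lowest fHelm at each phi1."""
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        phi1 = round(float(row["phi1"]), 6)  # round so equal compositions match
        f_helm = float(row["fHelm"])
        if phi1 not in best or f_helm < best[phi1][1]:
            best[phi1] = (row["phase"], f_helm)
    return best


# Hypothetical rows using a subset of the documented column labels.
# The numbers are placeholders, not results from the repository.
SAMPLE_CSV = """\
phase,prefix,fHelm,phi0,phi1
A15,sweep_0,3.10,0.8,0.2
C14,sweep_0,3.05,0.8,0.2
A15,sweep_1,3.20,0.7,0.3
C14,sweep_1,3.25,0.7,0.3
"""

if __name__ == "__main__":
    print(decode_nbc_dirname("nbc_0_5"))
    print(lowest_fhelm_by_phi1(SAMPLE_CSV))
```

To use this on a real file, read its contents with
open(path).read() and pass the text to lowest_fhelm_by_phi1. Comparing
fHelm at fixed composition only gives a first look at which phase is
favored; two-phase coexistence still requires the common tangent
construction done in the notebooks.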