<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Effex app documentation – Modeling</title>
    <link>/documentation/docs/software/modeling/</link>
    <description>Recent content in Modeling on Effex app documentation</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/documentation/docs/software/modeling/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Docs: Upload a dataset</title>
      <link>/documentation/docs/software/modeling/experiment-data/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/documentation/docs/software/modeling/experiment-data/</guid>
      <description>
        
        
        &lt;h3 id=&#34;uploading-the-data&#34;&gt;&lt;strong&gt;Uploading the data&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Before the dataset is uploaded, choose one of the two following options for headers:&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=&#34;/documentation/documentation/img/headers_modeling.png&#34; align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;300&#34; width=&#34;300&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;This indicates whether the dataset that you will upload has a header row. After specifying this, copy the dataset from an Excel file, click on the plus sign shown below&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=&#34;/documentation/documentation/img/plus_sign_upload.png&#34; align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;100&#34; width=&#34;100&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;and paste the data. On Windows, the data can be pasted by pressing &lt;strong&gt;CTRL + V&lt;/strong&gt;. On Mac, this can be done by pressing &lt;strong&gt;command⌘ + V&lt;/strong&gt;.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&#34;steps-to-follow-after-pasting-the-data&#34;&gt;&lt;strong&gt;Steps to follow after pasting the data&lt;/strong&gt;&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;By default, the data in each column of the dataset is assigned to be quantitative. If the user wishes to change some of these default assignments to categorical, then clicking on the dropdown menu&lt;/p&gt;
 &lt;figure&gt;
 &lt;img src=&#34;/documentation/documentation/img/column_q_or_c.png&#34; align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;120&#34; width=&#34;120&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;will allow the user to make this change.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The last column of the dataset is assumed to be a response column, and the remaining columns are assumed to be factor columns. To change this default assignment, toggle between the options provided below&lt;/p&gt;
 &lt;figure&gt;
 &lt;img src=&#34;/documentation/documentation/img/column_class.png&#34; align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;150&#34; width=&#34;150&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;It is possible to have more than one response column. However, all response columns must strictly have data of the quantitative type for modeling. If a column is assigned to be categorical, then this column will not be allowed to be set as a response column.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
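&lt;p&gt;The default assignments and the rule for response columns described in the steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names &lt;code&gt;default_columns&lt;/code&gt; and &lt;code&gt;validate_columns&lt;/code&gt; and the dictionary layout are hypothetical and are not part of the software.&lt;/p&gt;

```python
def default_columns(names):
    """Assign defaults: every column quantitative, last column the response."""
    last = len(names) - 1
    return [
        {"name": name,
         "type": "quantitative",
         "role": "response" if i == last else "factor"}
        for i, name in enumerate(names)
    ]


def validate_columns(columns):
    """Return the names of response columns that are not quantitative."""
    return [c["name"] for c in columns
            if c["role"] == "response" and c["type"] != "quantitative"]


# Hypothetical example with three factor columns and two response columns.
columns = default_columns(["SILICA", "SILANE", "SULFUR", "ABRASION", "ELONG"])
columns[3]["role"] = "response"      # a second response column is allowed
invalid = validate_columns(columns)  # empty: both responses are quantitative
```

&lt;p&gt;If a response column were switched to categorical, &lt;code&gt;validate_columns&lt;/code&gt; would list it, mirroring the restriction that categorical columns cannot be set as responses.&lt;/p&gt;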
&lt;hr&gt;
&lt;p&gt;Here is a screenshot of a dataset uploaded successfully with the appropriate selections for the options described above.&lt;/p&gt;
  &lt;figure&gt;
  &lt;img src=&#34;/documentation/documentation/img/example_dataset.png&#34; align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;950&#34; width=&#34;950&#34;&gt;
  &lt;/figure&gt;
&lt;p&gt;In our example, we take a dataset from the paper of Derringer &amp;amp; Suich (1980). This dataset comes from an experiment where researchers study the effect of three input factors, namely, SILICA, SILANE and SULFUR, on the quality of rubber tires. The quality of the rubber tires is measured using multiple response variables, of which we consider two in our dataset, namely, ABRASION and ELONG (i.e., elongation at break).&lt;/p&gt;
&lt;p&gt;After the dataset has been defined appropriately using the options above, click on &lt;strong&gt;Save dataset&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Derringer, G., &amp;amp; Suich, R. (1980). Simultaneous Optimization of Several Response Variables. Journal of Quality Technology, 12(4), 214–219.&lt;/li&gt;
&lt;/ol&gt;

      </description>
    </item>
    
    <item>
      <title>Docs: Launch a modeling calculation</title>
      <link>/documentation/docs/software/modeling/response_selection/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/documentation/docs/software/modeling/response_selection/</guid>
      <description>
        
        
        &lt;p&gt;On this page, the user can specify which response columns should be used for modeling by using the sliding toggle buttons (enclosed in a black box in the image below) presented in the row of each response column.&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=&#34;/documentation/documentation/img/Response_inclusion.png&#34; align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;400&#34; width=&#34;900&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;After a response has been selected, the following options emerge.&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=&#34;/documentation/documentation/img/response_options.png&#34; align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;400&#34; width=&#34;700&#34;&gt;
&lt;/figure&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Model definition&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;In the model definition dropdown menu, the following options appear&lt;/p&gt;
 &lt;figure&gt;
 &lt;img src=&#34;/documentation/documentation/img/model_definition_response.png&#34; align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;400&#34; width=&#34;300&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;This option can be used to specify all model terms that need to be considered for modeling. A more detailed explanation of these three types of models is as follows:&lt;/p&gt;
 &lt;ol&gt;
 &lt;li&gt;Main effects (this includes only first-order effects)
 &lt;li&gt;Main and interaction effects (this includes first-order effects as well as two-factor interaction effects)
 &lt;li&gt;Main and second-order effects (this includes first-order effects, two-factor interaction effects and quadratic effects)
 &lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Intercept&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;The user can choose to include or exclude the intercept from modeling using the following drop down menu&lt;/p&gt;
 &lt;figure&gt;
 &lt;img src=&#34;/documentation/documentation/img/intercept_inclusion.png&#34; align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;400&#34; width=&#34;300&#34;&gt;
 &lt;/figure&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Model heredity&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;For model heredity, there are &lt;strong&gt;three choices&lt;/strong&gt;: Strong, Weak and No heredity.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Strong heredity&lt;/strong&gt;:&lt;br&gt;
• If an interaction effect is included, then both the linear effects of the involved factors are also included in the model.&lt;br&gt;
• If a quadratic effect is included, then the linear effect of the corresponding factor is also included in the model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Weak heredity&lt;/strong&gt;:&lt;br&gt;
• If an interaction effect is included, then at least one of the linear effects of the involved factors is also included in the model.&lt;br&gt;
• If a quadratic effect is included, then the linear effect of the corresponding factor is also included in the model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;No heredity&lt;/strong&gt;:&lt;br&gt;
Neither strong nor weak heredity is required.&lt;/p&gt;
&lt;p&gt;Note: Ockuly et al. (2017) observed that in real experiments, strong heredity occurs more frequently than weak heredity, which in turn occurs more frequently than no heredity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Transformations&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The software allows the response variable to be transformed if the user wishes to do so. To transform a specific response before modeling, the following options are offered: Original, Sqrt, Log. The &lt;strong&gt;original&lt;/strong&gt; option leaves the response as is, the &lt;strong&gt;sqrt&lt;/strong&gt; option takes the square root of each value in the response column, and the &lt;strong&gt;log&lt;/strong&gt; option takes the base-2 logarithm of each value in the response column. To use the sqrt transformation, all values in the original response column must be greater than or equal to 0, and for the log transformation, all values must be strictly greater than 0. For a given response, all three options can be selected together, in which case three separate analyses are done, one for each transformation type.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
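&lt;p&gt;The transformation options and their validity conditions described under &lt;strong&gt;Transformations&lt;/strong&gt; can be sketched in Python as follows. This is a minimal illustration, not the software&#39;s implementation; the function name &lt;code&gt;transform_response&lt;/code&gt; and the lowercase option labels are assumptions made for the example.&lt;/p&gt;

```python
import math


def transform_response(values, kind):
    """Apply one of the three transformations to a list of response values."""
    if kind == "original":
        return list(values)
    if kind == "sqrt":
        # Square root requires every value to be greater than or equal to 0.
        if min(values) >= 0:
            return [math.sqrt(v) for v in values]
        raise ValueError("sqrt requires all values to be >= 0")
    if kind == "log":
        # Base-2 logarithm requires every value to be strictly greater than 0.
        if min(values) > 0:
            return [math.log2(v) for v in values]
        raise ValueError("log requires all values to be > 0")
    raise ValueError("unknown transformation: " + kind)


# Selecting all three options gives one transformed response per option.
response = [1.0, 4.0, 16.0]
results = {kind: transform_response(response, kind)
           for kind in ("original", "sqrt", "log")}
```

&lt;p&gt;Each entry of &lt;code&gt;results&lt;/code&gt; corresponds to one of the separate analyses that would be run when several transformations are selected for the same response.&lt;/p&gt;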
&lt;p&gt;After all of the above options are correctly selected, click on the &lt;strong&gt;Launch modeling&lt;/strong&gt; button to begin the calculations for model selection. The model selection is performed using the method proposed by Vazquez et al. (2021).&lt;/p&gt;
&lt;p&gt;The user will be notified via the &lt;strong&gt;Notifications tab&lt;/strong&gt; when the calculations are complete. When the modeling calculations are completed, the user can navigate to the &lt;a href=&#34;/documentation/documentation/docs/software/my-doe-items/&#34;&gt;&lt;strong&gt;My DoE items&lt;/strong&gt;&lt;/a&gt; tab, locate their data set under the option &lt;a href=&#34;/documentation/documentation/docs/software/my-doe-items/saved_datasets/&#34;&gt;&lt;strong&gt;Data sets&lt;/strong&gt;&lt;/a&gt;, and select the specific dataset to review the modeling results as shown below. For documentation on modeling results, refer to this &lt;a href=&#34;/documentation/documentation/docs/software/my-doe-items/saved_datasets/&#34;&gt;page&lt;/a&gt;.&lt;/p&gt;
&lt;figure&gt;
&lt;img src=&#34;/documentation/documentation/img/my_doe_items.png&#34; align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;700&#34; width=&#34;1000&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Ockuly, R. A., Weese, M. L., Smucker, B. J., Edwards, D. J., &amp;amp; Chang, L. (2017). Response surface experiments: A meta-analysis. Chemometrics and Intelligent Laboratory Systems, 164, 64-75.&lt;/li&gt;
&lt;li&gt;Vazquez, A. R., Schoen, E. D., &amp;amp; Goos, P. (2021). A mixed integer optimization approach for model selection in screening experiments. Journal of Quality Technology, 53 (3), 243-266.&lt;/li&gt;
&lt;/ol&gt;
      </description>
    </item>
    
    <item>
      <title>Docs: Modeling results</title>
      <link>/documentation/docs/software/modeling/modeling_results/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/documentation/docs/software/modeling/modeling_results/</guid>
      <description>
        
        
        &lt;hr&gt;
&lt;h3 id=&#34;modeling-options-and-filters&#34;&gt;&lt;strong&gt;Modeling options and filters&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;On the &lt;strong&gt;Modeling results&lt;/strong&gt; page, the results that are displayed are based on two specifications: the selected response variable and the selected transformation type.&lt;/p&gt;
&lt;p&gt;At the top of this page, you will find two boxes that indicate this. An example is given below:&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=/documentation/documentation/img/response_with_transform.png align=&#34;center&#34; width=&#34;40%&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;In the &lt;strong&gt;Response&lt;/strong&gt; box, the name of the corresponding response variable is displayed and in the &lt;strong&gt;Transformation&lt;/strong&gt; box, the specific transformation of the response variable used in the modeling is displayed.&lt;/p&gt;
&lt;p&gt;If the modeling calculations for this dataset were completed for multiple response variables, then the dropdown menu for the &lt;strong&gt;Response&lt;/strong&gt; box (see the image above) will allow the user to switch between modeling results for the different response variables.&lt;/p&gt;
&lt;p&gt;If a response variable was modeled using more than one type of transformation, then the dropdown menu for the &lt;strong&gt;Transformation&lt;/strong&gt; box (see the image above) will allow the user to switch between modeling results for the different transformations of the response variable specified in the Response box.&lt;/p&gt;
&lt;h3 id=&#34;other-options-for-filtering&#34;&gt;&lt;strong&gt;Other options for filtering&lt;/strong&gt;&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Forcing and removing effects&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If the user has some prior knowledge that certain effects must be included in all models that they wish to consider, the user can specify this using the &lt;strong&gt;Force effects in the model&lt;/strong&gt; dropdown option as shown below.&lt;/p&gt;
 &lt;figure&gt;
   &lt;img src=/documentation/documentation/img/modeling_force_effects.png align=&#34;center&#34; height=&#34;400&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;The dropdown list will display all effects that were originally considered for the analysis. This option allows the user to specify more than one effect to include in all models. Similarly, if the user has some prior knowledge that certain effects must be excluded in all models that they wish to consider, the user can specify this using the &lt;strong&gt;Force effects NOT in the model&lt;/strong&gt; dropdown option.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Heredity&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If the user wishes to review models that obey a certain type of heredity assumption, this can be specified using the checkboxes provided under the heading &lt;strong&gt;Heredity&lt;/strong&gt; (as given below).&lt;/p&gt;
 &lt;figure&gt;
   &lt;img src=/documentation/documentation/img/modeling_heredity.png align=&#34;center&#34; height=&#34;200&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;A description of the concept of heredity can be found &lt;a href=&#34;/documentation/documentation/docs/software/modeling/Response_selection/&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To include all models that satisfy the strong heredity assumption, tick the box with the label &amp;lsquo;Strong&amp;rsquo;. To include all models that satisfy the weak heredity assumption, tick the box with the label &amp;lsquo;Weak&amp;rsquo;. Finally, to include all models that satisfy neither strong nor weak heredity, tick the box with the label &amp;lsquo;No-heredity&amp;rsquo;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
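&lt;p&gt;To make the heredity filter concrete, the following Python sketch classifies a model according to the strong and weak heredity definitions. It is an illustration only: the function name &lt;code&gt;heredity_class&lt;/code&gt; and the representation of effects are hypothetical, and the sketch assumes that each model is labeled with the strongest heredity type it satisfies. How the software treats a strong-heredity model when only the &amp;lsquo;Weak&amp;rsquo; box is ticked is not stated here.&lt;/p&gt;

```python
def heredity_class(model):
    """Classify a model as 'strong', 'weak' or 'none'.

    A model is a list of effects: a main effect is a factor name such as
    "SILICA"; an interaction or quadratic effect is a pair of factor names,
    such as ("SILICA", "SULFUR") or ("SILANE", "SILANE").
    """
    mains = {e for e in model if isinstance(e, str)}
    strong = True
    weak = True
    for effect in model:
        if isinstance(effect, str):
            continue
        present = [factor in mains for factor in effect]
        if not all(present):
            strong = False   # some parent linear effect is missing
        if not any(present):
            weak = False     # no parent linear effect is present at all
    if strong:
        return "strong"
    return "weak" if weak else "none"


# Hypothetical models built from the factors of the example dataset.
models = [
    ["SILICA", "SULFUR", ("SILICA", "SULFUR")],
    ["SILICA", ("SILICA", "SULFUR")],
    [("SILANE", "SILANE")],
]
labels = [heredity_class(m) for m in models]
```

&lt;p&gt;For a quadratic effect both entries of the pair are the same factor, so a missing linear effect violates strong and weak heredity at once, in line with the definitions.&lt;/p&gt;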
&lt;hr&gt;
&lt;h3 id=&#34;plots-with-modeling-results&#34;&gt;&lt;strong&gt;Plots with modeling results&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;For the specified combination of response variable and type of transformation, three graphs/plots are provided:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Effects in the generated models rasterplot&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here is an example of such a plot.&lt;/p&gt;
 &lt;figure&gt;
   &lt;img src=/documentation/documentation/img/raster_plot_example.png align=&#34;center&#34; width=&#34;95%&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;This is called a raster plot, which can be viewed as a table with a certain number of rows and columns. Each row in this table corresponds to one specific model. For a given row, the number of highlighted cells indicates the number of effects that are present in that model, while the columns of the highlighted cells indicate which specific effects are present in that model.&lt;/p&gt;
&lt;p&gt;For example, in the image given above, there are 26 rows, which means that there are 26 models presented in the raster plot. The last row at the bottom indicates that the model corresponding to this particular row, has only one effect which is the main effect of &amp;lsquo;SILANE&amp;rsquo; corresponding to column number two. By hovering with the mouse pointer, you can review other rows (models) and columns (effects) too.&lt;/p&gt;
&lt;p&gt;All models in the raster plot are ranked by the size of the models, with models that have a larger number of terms appearing at the top and the ones with a smaller number of terms appearing at the bottom.&lt;/p&gt;
&lt;p&gt;On hovering over a particular cell, the &amp;lsquo;x&amp;rsquo; value indicates the effect or column that cell corresponds to, the &amp;lsquo;y&amp;rsquo; value indicates the model or row that the cell corresponds to, and the &amp;lsquo;z&amp;rsquo; value indicates a value which matches the intensity of the color highlighted in the cell. The &lt;strong&gt;color code&lt;/strong&gt; of the cells is determined by the following option&lt;/p&gt;
 &lt;figure&gt;
   &lt;img src=/documentation/documentation/img/color_code.png align=&#34;center&#34; width=&#34;200&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;If the &lt;strong&gt;Effects&lt;/strong&gt; option is selected, for a given row (model), a red colored cell indicates that the corresponding effect has a positive value for its coefficient in that model, while a blue colored cell indicates a negative value. The darker the color, the larger the magnitude of the coefficient (positive or negative).&lt;/p&gt;
 &lt;div style=&#34;display: flex; align-items: center;&#34;&gt;
   &lt;p&gt;
   If the &lt;strong&gt;P-values&lt;/strong&gt; option is selected, for a given row (model), the colors of the cells are divided (according to the legend in the image to the right) by the following ranges for the p-values:&lt;br&gt;
 • Between 1 and 0.75 &lt;br&gt;
 • Between 0.75 and 0.50 &lt;br&gt;
 • Between 0.50 and 0.20 &lt;br&gt;
 • Between 0.20 and 0.10 &lt;br&gt;
 • Between 0.10 and 0.05 &lt;br&gt;
 • Between 0.05 and 0.01 &lt;br&gt;
 • Less than 0.01 &lt;br&gt;&lt;/p&gt;
 &lt;img src=/documentation/documentation/img/p_value_legend.png alt=&#34;Description&#34; style=&#34;width: 50px; margin-right: 30px;&#34;&gt;
 &lt;/div&gt;
&lt;p&gt;Note that for any mixed model analysis involving one or more random effects (i.e., variance components), all reported p-values are based on the corrected degrees of freedom calculated according to the method of Kenward and Roger (1997)&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;All models that are displayed in the raster plot are characterized by metrics that quantify the statistical quality of each model. To see this, click on&lt;/p&gt;
 &lt;figure&gt;
   &lt;img src=/documentation/documentation/img/modeling_graphical_filter.png align=&#34;center&#34; height=&#34;50&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;This will open up a &lt;a href=&#34;https://en.wikipedia.org/wiki/Parallel_coordinates&#34;&gt;parallel coordinate plot&lt;/a&gt; which allows you to visualize all models that appear in the raster plot based on the total number of effects, root mean squared error (RMSE), R&lt;sup&gt;2&lt;/sup&gt; adjusted, corrected AIC (Hurvich and Tsai 1989&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;) and &lt;a href=&#34;https://en.wikipedia.org/wiki/PRESS_statistic&#34;&gt;PRESS&lt;/a&gt;. For mixed model analysis, the conditional R&lt;sup&gt;2&lt;/sup&gt; adjusted is reported, which also takes into account the variance in the response explained by the random effects (i.e., variance components).&lt;/p&gt;
&lt;p&gt;An example of such a plot is given below&lt;/p&gt;
 &lt;figure&gt;
   &lt;img src=/documentation/documentation/img/modeling_pcp.png align=&#34;center&#34; width=&#34;700&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;This plot is interactive and therefore allows the user to specify vertical constraints on each of the five provided criteria. For example, to display all models with 5 and 6 terms, define a constraint as follows:&lt;br&gt;&lt;/p&gt;
&lt;p&gt;Hover close to the vertical axis corresponding to the number of effects. The mouse pointer will change to a plus sign. When this happens, click on the vertical line in a region just below the value 5, drag the mouse pointer to an area just above 6, and release. A pink line will appear confirming your specification. Additional constraints can be specified on other vertical axes corresponding to other model ranking criteria. To remove a constraint, hover over the pink line that you would like to remove and, when the cursor turns into an arrow pointing up and down, click once and the pink line will be removed. Note that only one constraint can be specified per axis.&lt;/p&gt;
&lt;p&gt;After all constraints are specified, click the &lt;strong&gt;OK&lt;/strong&gt; button, upon which the user will observe that only the models that satisfy the specified constraints on the model ranking criteria will remain in the raster plot. The plots discussed below will also be updated due to this filtering.&lt;/p&gt;
&lt;p&gt;To practice using such a plot, refer &lt;a href=&#34;/documentation/documentation/docs/software/catalog-search/pcp/&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;div style=&#34;position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;&#34;&gt;
    &lt;iframe src=&#34;https://www.youtube.com/embed/videoseries?list=PLYVwhy3SGC2ChX_7I-QHltLn_SjKqYSmy&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;&#34; allowfullscreen title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Box plots&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here is an example of such a plot.&lt;/p&gt;
 &lt;figure&gt;
   &lt;img src=/documentation/documentation/img/modeling_boxplot.png align=&#34;center&#34; height=&#34;400&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;The different effects are marked on the x-axis, where for each effect a boxplot is presented. For a given effect, its corresponding boxplot shows the distribution of values determined by the setting of the &lt;strong&gt;color code&lt;/strong&gt; option. If the &lt;strong&gt;color code&lt;/strong&gt; is set to &amp;lsquo;Coefficient&amp;rsquo;, then the boxplot for a given effect shows the distribution of coefficient values across all models displayed in the raster plot which include that effect. The example plot above displays the coefficient sizes for each of the effects in the example dataset. Similarly, if the &lt;strong&gt;color code&lt;/strong&gt; is set to &amp;lsquo;P-values&amp;rsquo;, then the boxplot gives the distribution of p-values across all models displayed in the raster plot which include that effect.&lt;/p&gt;
&lt;p&gt;For a given effect, a wider distribution of values in the boxplot suggests that the coefficient values or p-values are highly variable across the models that are displayed in the raster plot.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Effect frequencies&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here is an example of such a plot.&lt;/p&gt;
 &lt;figure&gt;
   &lt;img src=/documentation/documentation/img/modeling_effect_summary.png align=&#34;center&#34; HSPACE=&#34;20&#34; height=&#34;400&#34; width=&#34;900&#34;&gt;
 &lt;/figure&gt;
&lt;p&gt;The different effects are marked on the x-axis, where for each effect a bar is presented. For a given effect, its bar gives the total number of times the effect appears across all models that are displayed in the raster plot: the taller the bar, the more often the effect appears. This plot helps identify which effects consistently appear across all models.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
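&lt;p&gt;The ordering of models in the raster plot and the counts shown in the effect frequencies plot can be reproduced with a few lines of Python. This is a sketch on hypothetical data: the list &lt;code&gt;models&lt;/code&gt; and its contents are invented for illustration and do not come from the example dataset&#39;s results.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical models, each given as the list of effects it contains.
models = [
    ["SILANE"],
    ["SILICA", "SILANE", "SULFUR"],
    ["SILICA", "SILANE"],
]

# Raster plot order: models with more terms first.
ranked = sorted(models, key=len, reverse=True)

# Effect frequencies: number of displayed models that contain each effect.
frequencies = Counter(effect for model in models for effect in model)
```

&lt;p&gt;An effect whose count equals the number of displayed models, such as SILANE in this example, appears in every model and is a natural candidate for the final model.&lt;/p&gt;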
&lt;hr&gt;
&lt;h3 id=&#34;additional-options&#34;&gt;&lt;strong&gt;Additional options&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Show Intercept&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This option is presented as follows&lt;/p&gt;
  &lt;figure&gt;
    &lt;img src=/documentation/documentation/img/show_intercept_modeling.png align=&#34;center&#34; width=&#34;40%&#34;&gt;
  &lt;/figure&gt;
&lt;p&gt;where this option can be toggled on or off. When on, the raster plot, boxplot and the effect frequency plot will display additional columns to indicate which effects did not appear in all models that are presented in the raster plot.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id=&#34;choosing-a-final-model&#34;&gt;&lt;strong&gt;Choosing a final model&lt;/strong&gt;&lt;/h3&gt;
&lt;br&gt;
&lt;p&gt;There are two ways to select a final model: &lt;strong&gt;recommended&lt;/strong&gt; and &lt;strong&gt;manual&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;If the user prefers the software to recommend a single best model, click on the following button&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=/documentation/documentation/img/get_recommended_model.png align=&#34;center&#34; width=&#34;30%&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;When this button is clicked, the software will recommend the best model of all models that appear in the raster plot using the utopia method considering the following criteria: corrected AIC (Hurvich and Tsai 1989) and PRESS. More information on the utopia method can be found &lt;a href=&#34;/documentation/documentation/docs/software/comparisons/utopia/&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To select a final model manually, use the &lt;strong&gt;graphical filtering&lt;/strong&gt; and the other options discussed above to filter out the single model of interest, and then click on the &lt;strong&gt;Get recommended model&lt;/strong&gt; button.&lt;/p&gt;
&lt;p&gt;On clicking the &lt;strong&gt;Get recommended model&lt;/strong&gt; button, all model diagnostic results will appear for the selected model.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;section class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34; role=&#34;doc-endnote&#34;&gt;
&lt;p&gt;Kenward, M. G., &amp;amp; Roger, J. H. (1997). Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 983-997.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34; role=&#34;doc-endnote&#34;&gt;
&lt;p&gt;Hurvich, C. M., and C.-L. Tsai. (1989). Regression and time series model selection in small samples. Biometrika 76(2):297–307.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;

      </description>
    </item>
    
    <item>
      <title>Docs: Model details (for mixed models)</title>
      <link>/documentation/docs/software/modeling/model_details_mixed/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/documentation/docs/software/modeling/model_details_mixed/</guid>
      <description>
        
        
        &lt;p&gt;On clicking the &lt;strong&gt;Get recommended model&lt;/strong&gt; button, all model diagnostic results will appear for the selected model. This screen has six tabs:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Summary&lt;/li&gt;
&lt;li&gt;Effect details&lt;/li&gt;
&lt;li&gt;Q-Q plot&lt;/li&gt;
&lt;li&gt;Diagnostic graphs&lt;/li&gt;
&lt;li&gt;Actual vs Predicted&lt;/li&gt;
&lt;li&gt;Variance components&lt;/li&gt;
&lt;/ol&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;1. Summary&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this summary tab, the model is described using many quantitative and qualitative measures, some of which describe the statistical quality of the selected model. These are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Root mean squared error (RMSE): The RMSE is an estimate of the standard deviation of the residual error. More generally, it quantifies the prediction accuracy of the selected model. The lower the value for RMSE, the better the prediction accuracy. This is calculated as follows:
$$RMSE = \sqrt{\frac{\sum_{i=1}^{N}{(y_i - \widehat{y}_i)}^2}{N-p}}$$
where &lt;br&gt;
$N$ &amp;ndash; total number of observations&lt;br&gt;
$i=1,\dots,N$&lt;br&gt;
$y_i$ &amp;ndash; observed value for the observation number $i$&lt;br&gt;
$\widehat{y}_i$ &amp;ndash; predicted value for the observation number $i$ conditioned on the random effect (i.e., the predicted value based on the conditional model)&lt;br&gt;
$p$ &amp;ndash; number of effects in the selected model including the intercept.&lt;br&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Conditional $R^2$ and Marginal $R^2$: The conditional $R^2$ is the value for the coefficient of determination including the variance in the response explained by the variance components, while the marginal $R^2$ excludes the variance associated with the variance components (see Piepho 2023&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; for more details).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Akaike information criterion (AIC) and corrected AIC (See &lt;a href=&#34;https://en.wikipedia.org/wiki/Akaike_information_criterion&#34;&gt;here&lt;/a&gt;)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Bayesian information criterion (&lt;a href=&#34;https://en.wikipedia.org/wiki/Bayesian_information_criterion&#34;&gt;BIC&lt;/a&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Predicted residual error sum of squares (&lt;a href=&#34;https://en.wikipedia.org/wiki/PRESS_statistic&#34;&gt;PRESS&lt;/a&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Condition number: Ratio between the maximum and minimum eigenvalues of the information matrix of the model matrix.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
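&lt;p&gt;The RMSE formula above can be checked with a short Python sketch. The function name &lt;code&gt;rmse&lt;/code&gt; and the numbers are hypothetical; in a mixed model the predicted values would be the conditional predictions, which are simply supplied as a list here.&lt;/p&gt;

```python
import math


def rmse(observed, predicted, n_effects):
    """RMSE with N - p in the denominator, where p counts the intercept."""
    n = len(observed)
    if n - n_effects > 0:
        sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
        return math.sqrt(sse / (n - n_effects))
    raise ValueError("need more observations than effects")


# Hypothetical values: 5 observations, model with intercept and one effect.
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [11.0, 12.0, 13.0, 16.0, 19.0]
value = rmse(observed, predicted, n_effects=2)  # sqrt(3 / 3) = 1.0
```

&lt;p&gt;Dividing by $N-p$ rather than $N$ accounts for the $p$ coefficients estimated from the data, so adding effects to a model does not automatically lower the RMSE.&lt;/p&gt;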
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;2. Effect details&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this tab, the details of the statistical tests performed on each effect are displayed in a table. For each effect in the selected model, the following information is provided:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The effect&amp;rsquo;s &lt;strong&gt;coefficient value&lt;/strong&gt; (2nd column). This quantifies the size of the effect&amp;rsquo;s contribution in changing the average value of the response.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The &lt;strong&gt;standard error&lt;/strong&gt; for the effect&amp;rsquo;s coefficient (3rd column). Since the true coefficient values are unknown and are estimated from the data, there is some uncertainty around the estimated coefficient value. The standard error quantifies the uncertainty associated with the estimated coefficient value.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The 4th and 5th columns, &lt;strong&gt;&#39;[0.025&#39;&lt;/strong&gt; and &lt;strong&gt;&#39;0.975]&#39;&lt;/strong&gt;, give the lower and upper bounds of the 95% confidence interval for the coefficient.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Degrees of Freedom&lt;/strong&gt; (6th column): For linear mixed models, the denominator degrees of freedom used in the t-tests is calculated based on the method of Kenward and Roger (1997)&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt; which takes into account the random components and provides a more accurate test for statistical significance for all the effects.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;T-statistic&lt;/strong&gt; (7th column) is a measure that is calculated to assess the statistical significance of an effect. For a given effect, if the absolute value of the T-statistic is much larger than 0, it is likely that the effect is statistically significant.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;P-value&lt;/strong&gt; (8th column) is a measure based on the T-statistic. It quantifies the probability of observing a coefficient estimate at least as extreme as the one obtained when the true coefficient value is zero. The lower the p-value, the stronger the evidence that the effect is statistically significant. For a given effect, if the p-value is lower than 0.05, we say that the effect is statistically significant.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
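&lt;p&gt;The relationship between these columns can be sketched in Python. This is a minimal illustration, not the app&amp;rsquo;s implementation: the function names and the example numbers are hypothetical, the degrees of freedom are taken as given (for example from the 6th column) rather than computed with the Kenward-Roger method, and the t-distribution is integrated numerically so that only the standard library is needed.&lt;/p&gt;

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Probability of a T-statistic at least as extreme as t_stat under a zero coefficient."""
    upper = abs(t_stat)
    h = upper / steps
    # Simpson's rule for the area under the density between 0 and |t_stat|.
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


def t_critical(df, alpha=0.05):
    """Critical value c with two-sided tail probability alpha, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_sided_p_value(mid, df) > alpha:
            lo = mid  # tail still too heavy: move the candidate outwards
        else:
            hi = mid
    return (lo + hi) / 2


def summarize_effect(coef, std_err, df):
    """Rebuild the interval bounds, T-statistic and p-value for one effect."""
    t_stat = coef / std_err
    half_width = t_critical(df) * std_err
    return {
        "lower": coef - half_width,
        "upper": coef + half_width,
        "t": t_stat,
        "p": two_sided_p_value(t_stat, df),
    }


# Hypothetical effect: coefficient 1.8, standard error 0.6, 10 degrees of freedom.
row = summarize_effect(1.8, 0.6, 10.0)
print(row)
```

&lt;p&gt;In this sketch the interval excludes zero exactly when the p-value is below 0.05, which is why the two criteria lead to the same conclusion about an effect.&lt;/p&gt;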
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;3. Q-Q plot&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A &lt;a href=&#34;https://en.wikipedia.org/wiki/Q%E2%80%93Q_plot&#34;&gt;Q-Q plot&lt;/a&gt; is used to check if the residuals for all observations are normally distributed. Here is an example of such a chart.&lt;/p&gt;
  &lt;figure&gt;
    &lt;img src=/documentation/documentation/img/qq_plot.png align=&#34;center&#34; width=&#34;90%&#34;&gt;
  &lt;/figure&gt;
&lt;p&gt;The x-axis represents the magnitude for the residuals, and the y-axis represents the z-score corresponding to each individual residual after sorting all residuals. If all points on the plot lie close to the straight black line, then this is an indication that the normality assumption for the residuals is satisfied.&lt;/p&gt;
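&lt;p&gt;As a rough sketch of how such points can be obtained, the Python snippet below sorts a set of residuals and pairs each one with a standard normal z-score. The function name, the example residuals and the plotting position (rank - 0.5) / n are assumptions made for illustration; the app may use a different convention.&lt;/p&gt;

```python
from statistics import NormalDist


def qq_points(residuals):
    """Pair each sorted residual with a standard normal z-score."""
    ordered = sorted(residuals)
    n = len(ordered)
    standard_normal = NormalDist()
    points = []
    for rank, value in enumerate(ordered, start=1):
        # The plotting position (rank - 0.5) / n stays strictly inside (0, 1).
        probability = (rank - 0.5) / n
        points.append((value, standard_normal.inv_cdf(probability)))
    return points


# Hypothetical residuals, for illustration only.
for residual, z_score in qq_points([0.4, -1.2, 0.1, 0.9, -0.3]):
    print(f"{residual:6.2f}  {z_score:6.3f}")
```

&lt;p&gt;If the residuals are approximately normal, the resulting pairs fall close to a straight line.&lt;/p&gt;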
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;4. Diagnostic graphs&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this tab several model diagnostic plots are presented.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Marginal residuals vs Actual response&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this plot, for the marginal model which excludes the variance explained by the random effects, the residuals (y-axis) are plotted against the actual response values (x-axis). Here is an example of such a plot:&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=/documentation/documentation/img/marginal_vs_actual.png align=&#34;center&#34; width=&#34;90%&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;If all points in this plot are randomly scattered around the central line without any visible trend, this indicates that the residuals obtained from the selected marginal model satisfy the assumption of homoscedasticity and independence for all residuals across all the actual observed values for the response. However, if there is a trend, then this may be a cause for concern as this indicates that the residuals vary in a systematic manner with respect to the actual value and that the assumptions of independence and homoscedasticity may be violated.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conditional standardized residuals vs Predicted response&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this plot, for the conditional model which includes the variance explained by the random effects, the predicted values (x-axis) are plotted against the standardized residuals (y-axis). Here is an example of such a plot:&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=/documentation/documentation/img/conditional_standardized_res_vs_predicted.png align=&#34;center&#34; width=&#34;90%&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;If all points in this plot are randomly scattered around the central line without any visible trend, this indicates that the standardized residuals obtained from the selected conditional model satisfy the assumption of homoscedasticity and independence for all standardized residuals across all the predicted values for the response. However, if there is a trend, then this may be a cause for concern as this indicates that the standardized residuals vary in a systematic manner with respect to the predicted value and that the assumptions of independence and homoscedasticity may be violated.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conditional standardized residuals vs Row number&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this plot, for the conditional model which includes the variance explained by the random effects, the standardized residuals (y-axis) are plotted against the row number of each data point (x-axis). Here is an example of such a plot:&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=/documentation/documentation/img/conditional_standardized_res_vs_row.png align=&#34;center&#34; width=&#34;90%&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;If all points in this plot are randomly scattered around the central line without any visible trend, this indicates that the residuals obtained from the selected model satisfy the assumption of homoscedasticity and independence for all residuals across the run order. However, if there is a trend, then this may be a cause for concern as this indicates that the run order is important to consider in the model and that the assumptions of independence and homoscedasticity may be violated.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cook&amp;rsquo;s distance vs Row number&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this plot, the Cook&amp;rsquo;s distances (y-axis) are plotted against the row number of each data point (x-axis). The Cook&amp;rsquo;s distance (Cook, 1977&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;) reflects how influential a data point is in determining the estimates for all the fixed effects. Here is an example of such a plot:&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=/documentation/documentation/img/cooks_vs_row.png align=&#34;center&#34; width=&#34;90%&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;Ideally, all data points have similar values for the Cook&amp;rsquo;s distance. Data points with larger values for the Cook&amp;rsquo;s distance are more influential in determining the estimates for all the fixed effects. Therefore, deleting the data points with large values for the Cook&amp;rsquo;s distance will produce big differences in the estimates for all the fixed effects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;MDFFITS vs Row number&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this plot, the MDFFITS values (y-axis) are plotted against the row number of each data point (x-axis). Similar to the Cook&amp;rsquo;s distance, the MDFFITS, or multivariate DFFITS (Belsley, D.A., Kuh, E., and Welsch, R.E., 1980&lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt;), also reflects how influential a data point is in determining the estimates for the marginal model. The difference between the two criteria is that MDFFITS uses the variance-covariance matrix for the fixed effects re-estimated after deleting the specific data point, while the Cook&amp;rsquo;s distance uses the same variance-covariance matrix for all calculations. Therefore, this chart will usually be very similar to the one discussed above. Here is an example of such a plot:&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=/documentation/documentation/img/mdffits_vs_row.png align=&#34;center&#34; width=&#34;90%&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;Ideally, all data points have similar values for the MDFFITS. Data points with larger values for the MDFFITS are more influential in determining the estimates for all the fixed effects. Therefore, deleting the data points with large values for the MDFFITS will produce big differences in the estimates for all the fixed effects.&lt;/p&gt;
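&lt;p&gt;For reference, the two influence measures are commonly written as below. The notation is an assumption introduced here, not taken from the app: the first factor is the estimated fixed-effects vector from all data minus the estimate obtained after deleting data point i, and p is the number of fixed-effect parameters. The exact scaling used by the app may differ.&lt;/p&gt;

```latex
D_i = \frac{(\hat{\beta} - \hat{\beta}_{(i)})^{\top}\,
            \widehat{\mathrm{Var}}(\hat{\beta})^{-1}\,
            (\hat{\beta} - \hat{\beta}_{(i)})}{p},
\qquad
\mathrm{MDFFITS}_i = \frac{(\hat{\beta} - \hat{\beta}_{(i)})^{\top}\,
            \widehat{\mathrm{Var}}(\hat{\beta}_{(i)})^{-1}\,
            (\hat{\beta} - \hat{\beta}_{(i)})}{p}
```

&lt;p&gt;The only difference is the variance-covariance matrix in the middle: the Cook&amp;rsquo;s distance uses the one estimated from all data points, while MDFFITS uses the one re-estimated without data point i.&lt;/p&gt;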
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;5. Actual vs Predicted&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In this tab, a scatter plot is displayed where, for each observation, the actual value of the response variable (on the y-axis) is plotted against its predicted value (on the x-axis). Such a plot is a diagnostic tool to check that the predicted values are close to the actual values in the dataset. Here is an example of such a plot.&lt;/p&gt;
  &lt;figure&gt;
    &lt;img src=/documentation/documentation/img/actual_vs_predicted.png align=&#34;center&#34; width=&#34;90%&#34;&gt;
  &lt;/figure&gt;
&lt;p&gt;If all points on the plot lie close to the straight line, then this is an indication that all predicted values obtained using the selected model are close to the true observed values in the dataset, and hence the model performs well in terms of predicting values for the response variable.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;6. Variance components&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;(This section will soon be updated.)&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;If the selected model passes all visual diagnostic checks, click on&lt;/p&gt;
  &lt;figure&gt;
    &lt;img src=/documentation/documentation/img/select_model_opt.png align=&#34;center&#34; width=&#34;10%&#34;&gt;
  &lt;/figure&gt;
&lt;p&gt;to proceed with the selected model to the optimization step and find the optimal settings of the input factors.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Note:
If the selected model does not pass all visual diagnostic checks, click on the following button&lt;/p&gt;
&lt;figure&gt;
  &lt;img src=/documentation/documentation/img/back_model_details.png align=&#34;center&#34; width=&#34;10%&#34;&gt;
&lt;/figure&gt;
&lt;p&gt;to return to the &lt;strong&gt;Modeling results&lt;/strong&gt; page to choose another model.&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;section class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34; role=&#34;doc-endnote&#34;&gt;
&lt;p&gt;Piepho, Hans‐Peter (2023). &amp;ldquo;An adjusted coefficient of determination (R2) for generalized linear mixed models in one go.&amp;rdquo; Biometrical Journal 65.7.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34; role=&#34;doc-endnote&#34;&gt;
&lt;p&gt;Kenward, M. G., &amp;amp; Roger, J. H. (1997). Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 983-997.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34; role=&#34;doc-endnote&#34;&gt;
&lt;p&gt;Cook, R.D. (1977), “Detection of Influential Observations in Linear Regression,” Technometrics, 19, 15–18.&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:4&#34; role=&#34;doc-endnote&#34;&gt;
&lt;p&gt;Belsley, D.A., Kuh, E., and Welsch, R.E. (1980), Regression Diagnostics; Identifying Influential Data and Sources of Collinearity, New York: John Wiley &amp;amp; Sons&amp;#160;&lt;a href=&#34;#fnref:4&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;

      </description>
    </item>
    
  </channel>
</rss>
