Additional functions

library(cardinalR)

These are helper functions included in the package.

Generating background noise

The gen_bkgnoise() function allows users to generate multivariate Gaussian noise to serve as background data in high-dimensional spaces.

# Example: Generate 4D background noise
bkg_data <- gen_bkgnoise(n = 500, p = 4, 
                         m = c(0, 0, 0, 0), s = c(2, 2, 2, 2))
head(bkg_data)
#> # A tibble: 6 × 4
#>       x1     x2     x3      x4
#>    <dbl>  <dbl>  <dbl>   <dbl>
#> 1 -1.39   1.87   2.06  -1.93  
#> 2  0.639  2.57   2.46  -1.26  
#> 3  2.57  -1.72   1.93  -0.340 
#> 4  5.00   1.98  -0.239 -1.53  
#> 5 -0.224 -0.630  0.849 -3.66  
#> 6 -2.76  -0.923  1.64  -0.0119

The generated data has independent dimensions with specified means (m) and standard deviations (s).
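
To make "independent dimensions" concrete, the base R sketch below draws each column separately with rnorm(). The helper name sketch_bkgnoise() is hypothetical, and this is an illustration of the idea rather than the package's implementation.

```r
# Hypothetical helper (not part of cardinalR): one independent normal
# draw per dimension, using that dimension's mean and standard deviation.
sketch_bkgnoise <- function(n, m, s) {
  stopifnot(length(m) == length(s))
  cols <- Map(function(mu, sigma) rnorm(n, mean = mu, sd = sigma), m, s)
  names(cols) <- paste0("x", seq_along(m))
  as.data.frame(cols)
}

set.seed(1)
noise <- sketch_bkgnoise(n = 500, m = c(0, 0, 0, 0), s = c(2, 2, 2, 2))
dim(noise)
```

Because every column is drawn on its own, the sample correlations between columns should be close to zero for large n.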

Randomizing rows

randomize_rows() shuffles the rows of the input data into a random order.

randomized_data <- randomize_rows(bkg_data)
head(randomized_data)
#> # A tibble: 6 × 4
#>       x1     x2     x3     x4
#>    <dbl>  <dbl>  <dbl>  <dbl>
#> 1 -0.124  3.51   1.74  -1.97 
#> 2  4.42   2.58  -0.470  0.641
#> 3  1.54  -0.301  1.02   4.58 
#> 4  0.457 -0.156  2.28  -0.977
#> 5  1.26   2.97   0.794  0.905
#> 6  6.49  -1.16  -1.11   0.543
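
A comparable shuffle can be sketched in base R by permuting the row indices. The helper sketch_randomize_rows() is a hypothetical stand-in and may differ from how the package does it.

```r
# Hypothetical equivalent (not the package's code): permute the row
# indices with sample(), then reorder the data by that permutation.
sketch_randomize_rows <- function(data) {
  data[sample(nrow(data)), , drop = FALSE]
}

set.seed(1)
toy <- data.frame(id = 1:5, value = c(10, 20, 30, 40, 50))
shuffled <- sketch_randomize_rows(toy)
shuffled
```

Only the row order changes; each row keeps its own values together.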

Relocating clusters

relocate_clusters() translates clusters along any dimension(s). It centers each cluster by subtracting its mean, then adds a translation vector taken from the corresponding row of a user-supplied matrix (vert_mat), so that each row of vert_mat becomes the new center of one cluster.

df <- tibble::tibble(
  x1 = rnorm(12),
  x2 = rnorm(12),
  x3 = rnorm(12),
  x4 = rnorm(12),
  cluster = rep(1:3, each = 4)
)

vert_mat <- matrix(c(
  5, 0, 0, 0,
  0, 5, 0, 0,
  0, 0, 5, 0
), nrow = 3, byrow = TRUE)

relocated_df <- relocate_clusters(df, vert_mat)
head(relocated_df)
#> # A tibble: 6 × 5
#>       x1       x2     x3     x4 cluster
#>    <dbl>    <dbl>  <dbl>  <dbl>   <int>
#> 1  0.789  0.637    5.84   0.721       3
#> 2  5.61  -0.770    0.269 -1.34        1
#> 3  0.834  5.09    -0.621 -1.16        2
#> 4 -0.592 -0.00646  4.12  -0.196       3
#> 5  0.717  1.27     5.80   0.427       3
#> 6 -0.967  5.25     1.54   0.274       2
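
The center-then-translate step can be sketched in base R as follows. This is a minimal illustration, assuming the cluster labels are the integers 1..k and index the rows of vert_mat; sketch_relocate() is a hypothetical name, not the package function.

```r
# Hypothetical sketch: center each cluster, then shift it to the
# location given by its row of vert_mat.
# Assumption: cluster labels are integers 1..k matching the rows of vert_mat.
sketch_relocate <- function(data, vert_mat) {
  coord_cols <- setdiff(names(data), "cluster")
  for (cl in sort(unique(data$cluster))) {
    rows <- data$cluster == cl
    block <- as.matrix(data[rows, coord_cols])
    centered <- sweep(block, 2, colMeans(block))            # subtract the cluster mean
    data[rows, coord_cols] <- sweep(centered, 2, vert_mat[cl, ], "+")  # add the new center
  }
  data
}

set.seed(1)
toy <- data.frame(
  x1 = rnorm(12), x2 = rnorm(12), x3 = rnorm(12), x4 = rnorm(12),
  cluster = rep(1:3, each = 4)
)
toy_vert_mat <- matrix(c(
  5, 0, 0, 0,
  0, 5, 0, 0,
  0, 0, 5, 0
), nrow = 3, byrow = TRUE)

moved <- sketch_relocate(toy, toy_vert_mat)
# Each cluster mean now coincides with its row of toy_vert_mat.
aggregate(. ~ cluster, data = moved, FUN = mean)
```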

Generating rotation matrices

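The gen_rotation() function creates a rotation matrix in p-dimensional space from a list of planes and angles. Each plane is given as a pair of axis indices, and, as the example output shows (an angle of 60 yields a cosine of 0.5), angles are specified in degrees.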


rotations_4d <- list(
  list(plane = c(1, 2), angle = 60),
  list(plane = c(3, 4), angle = 90)
)

rot_mat <- gen_rotation(p = 4, planes_angles = rotations_4d)
rot_mat
#>           [,1]       [,2]         [,3]          [,4]
#> [1,] 0.5000000 -0.8660254 0.000000e+00  0.000000e+00
#> [2,] 0.8660254  0.5000000 0.000000e+00  0.000000e+00
#> [3,] 0.0000000  0.0000000 6.123234e-17 -1.000000e+00
#> [4,] 0.0000000  0.0000000 1.000000e+00  6.123234e-17
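
A matrix of this form can be built by composing plane (Givens) rotations. The sketch below is one way to do so under stated assumptions: angles are in degrees, and rotations are multiplied in list order, which only matters when planes share an axis. sketch_rotation() is a hypothetical name and need not match the package internals.

```r
# Hypothetical sketch: compose plane (Givens) rotations, angles in degrees.
sketch_rotation <- function(p, planes_angles) {
  rot <- diag(p)
  for (pa in planes_angles) {
    i <- pa$plane[1]
    j <- pa$plane[2]
    theta <- pa$angle * pi / 180        # degrees to radians
    g <- diag(p)                        # identity outside the chosen plane
    g[i, i] <- cos(theta)
    g[i, j] <- -sin(theta)
    g[j, i] <- sin(theta)
    g[j, j] <- cos(theta)
    rot <- rot %*% g
  }
  rot
}

sketch_rot <- sketch_rotation(
  p = 4,
  planes_angles = list(
    list(plane = c(1, 2), angle = 60),
    list(plane = c(3, 4), angle = 90)
  )
)

# A rotation matrix is orthogonal: t(R) %*% R is the identity.
all.equal(crossprod(sketch_rot), diag(4))
```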

Normalizing data

When combining clusters or transforming data geometrically, magnitudes can differ drastically. The normalize_data() function rescales the entire dataset to fit within [-1, 1] by dividing every value by the largest absolute value in the dataset.

norm_data <- normalize_data(bkg_data)
head(norm_data)
#>            x1          x2          x3           x4
#> 1 -0.19824950  0.26597594  0.29315698 -0.275567334
#> 2  0.09111172  0.36718998  0.35115760 -0.179721292
#> 3  0.36664451 -0.24519698  0.27529407 -0.048461095
#> 4  0.71300432  0.28227563 -0.03402328 -0.217801905
#> 5 -0.03189621 -0.08982008  0.12103091 -0.522438279
#> 6 -0.39364143 -0.13163195  0.23376883 -0.001695661
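
The rescaling amounts to a single division by the global maximum absolute value. The base R sketch below illustrates this; sketch_normalize() is a hypothetical name, not the package function.

```r
# Hypothetical sketch: divide every entry by the largest absolute value
# found anywhere in the data, so all values land in [-1, 1].
sketch_normalize <- function(data) {
  max_abs <- max(abs(as.matrix(data)))
  data / max_abs
}

toy <- data.frame(x1 = c(-4, 2, 1), x2 = c(0.5, 8, -2))
scaled <- sketch_normalize(toy)
range(as.matrix(scaled))
```

Because a single scale factor is applied to all columns, the relative proportions between dimensions are preserved.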

Generating cluster locations

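To place clusters in different positions, gen_clustloc() generates points in a simplex-like arrangement, so that the cluster centers are as close to equidistant from one another as possible. In the output below, the matrix has p rows and k columns, so each column holds one cluster center.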


centers <- gen_clustloc(p = 4, k = 5)
head(centers)
#>             [,1]       [,2]        [,3]       [,4]       [,5]
#> [1,]  0.26983778 -0.4294511 -0.17898562  0.8220179 -0.4834190
#> [2,]  0.03526906  0.6685061 -0.26238277 -0.9888410  0.5474486
#> [3,] -0.88656467  1.2648195 -0.20908822 -0.9657047  0.7965381
#> [4,]  2.20580849 -1.0348574  0.02853651 -1.4893315  0.2898438

Numeric generators

Two helper functions, gen_nproduct() and gen_nsum(), generate numeric vectors of positive integers whose product approximates, or whose sum equals, a user-specified target.

The function gen_nsum(n, k) divides a total sum n into k positive integers. It first assigns an equal base value to each element and then randomly distributes any remainder, ensuring the elements sum exactly to n.

gen_nsum(n = 100, k = 3)
#> [1] 34 33 33
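
The base-plus-remainder idea described above can be sketched in a few lines of base R. sketch_nsum() is a hypothetical name, and the package may distribute the remainder differently.

```r
# Hypothetical sketch: equal base share, then hand out the remainder
# one unit at a time to randomly chosen elements.
sketch_nsum <- function(n, k) {
  base <- n %/% k
  remainder <- n %% k
  parts <- rep(base, k)
  if (remainder > 0) {
    lucky <- sample.int(k, remainder)   # elements that receive one extra unit
    parts[lucky] <- parts[lucky] + 1
  }
  parts
}

set.seed(1)
parts <- sketch_nsum(n = 100, k = 3)
sum(parts)
```

Since the remainder is always smaller than k, no element receives more than one extra unit, and the total is exactly n.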

The function gen_nproduct(n, p) aims to produce p positive integers whose product is approximately n. It starts with all elements equal to the rounded p-th root of n and iteratively adjusts elements up or down in a randomized manner until the product is within a small tolerance of n. This accommodates the fact that exact integer solutions for a given product are often impossible.

gen_nproduct(n = 500, p = 4)
#> [1] 4 5 5 5
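
A randomized adjustment loop of this kind might look like the sketch below. The tolerance (tol), the iteration cap (max_iter), and the name sketch_nproduct() are assumptions made for illustration; the package's actual stopping rule and update scheme may differ.

```r
# Hypothetical sketch: start at the rounded p-th root, then nudge a random
# element up or down until the product is within a relative tolerance of n.
# tol and max_iter are illustrative assumptions, not package parameters.
sketch_nproduct <- function(n, p, tol = 0.05, max_iter = 1000) {
  parts <- rep(max(1, round(n^(1 / p))), p)
  iter <- 0
  while (abs(prod(parts) - n) / n > tol && iter < max_iter) {
    idx <- sample.int(p, 1)
    if (prod(parts) < n) {
      parts[idx] <- parts[idx] + 1      # product too small: grow one element
    } else if (parts[idx] > 1) {
      parts[idx] <- parts[idx] - 1      # product too large: shrink, staying >= 1
    }
    iter <- iter + 1
  }
  parts
}

set.seed(1)
factors <- sketch_nproduct(n = 500, p = 4)
prod(factors)
```

The iteration cap guards against inputs for which the loop could otherwise oscillate around the target without meeting the tolerance.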