ML101 – pt. 09: Reducing our Data – the PCA Node

Part of: Machine Learning 101
Machine Learning 101, Premium Course

4 Comments

  1. I have a quick question I am hoping you can answer.

    Why does the PCA node in analyze mode produce the following number of components when Number Of Components is set to 0 (fully represented):

    Number Of Components = pow(Points Per Sample, 2)*(length of the Attributes)

    • Hey Yancy, let’s take a look at the box example from the PCA SOP Exploration hip file with 8 points per sample and an attrib length of 3. In your formula it looks like this: pow(8,2)*3 = 192. Let’s rephrase that a bit: We want to build a new coordinate system with 8*3=24 axes. Since we want to express every axis as a high-dimensional vector, each of our 24 axes also needs 24 dimensions. So our total number of floats is equal to pow(24,2)=576. Since each point already has a P attrib that can store 3 floats, we distribute those floats over the P attribs of as many points as needed, so 576/3=192 points. And there’s your answer again.

      Let’s try that now with 8 points per sample and position and color: Now we have 8*(3+3)=48 axes and 8*(3+3)=48 dimensions per axis, so 2304 floats in total. But now we also have both a P and a Cd attrib on each point that we can use to store those floats, so 2304/6 = 384 = pow(8,2)*6.

      So in the end the formula for the number of component axes is just “Points Per Sample * Length of Attribs”. The formula for the number of points needed is “pow(number of component axes, 2)/Length of Attribs” – So your formula, just less condensed.

  2. Hi Chris. Thank you for making this very useful video.
    I have a question. What is the midpoint you refer to at 18:41 in this video?
    I thought the `Number of Components` was 128 because there are 128 poses from frame 1 to frame 128.

    • Hey! At this point in the video, we’re dealing with very high-dimensional data. If you take a look at 04:00 in the video, you’ll get a better idea of this midpoint. We want to take the entries of our dataset (points in the first example, poses in the second) and place them in a new coordinate system that PCA should build for us. Our first coordinate system will have a maximum of 3 axes (1 point * 3 coordinates per point) + one midpoint, the origin point of our coordinate system. Our second coordinate system could have a maximum of 70.000 axes (30K points * 3 coordinates per point) + again that origin point. But since we can lower the number of axes quite a bit without losing too much info, I chose 127 axes + one origin point, so in the end 128 values for our neural net.
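
The counting argument in the reply to the first comment can be checked with a few lines of Python. This is only a sketch of the arithmetic: `pca_component_counts` and its parameter names are made up for illustration and are not part of Houdini or the PCA SOP.

```python
def pca_component_counts(points_per_sample, attrib_lengths):
    """Hypothetical helper that mirrors the reply's arithmetic (not a Houdini API)."""
    # Floats stored per point, e.g. P -> 3, P and Cd -> 3 + 3 = 6
    floats_per_point = sum(attrib_lengths)
    # One axis per float in a sample
    axes = points_per_sample * floats_per_point
    # Every axis is itself a vector with `axes` dimensions
    total_floats = axes * axes
    # Spread those floats over the attribs of as many points as needed
    points_needed = total_floats // floats_per_point
    return {"axes": axes, "total_floats": total_floats, "points_needed": points_needed}


box_p_only = pca_component_counts(8, [3])       # 8 points per sample, P only
box_p_and_cd = pca_component_counts(8, [3, 3])  # 8 points per sample, P and Cd

print(box_p_only)    # {'axes': 24, 'total_floats': 576, 'points_needed': 192}
print(box_p_and_cd)  # {'axes': 48, 'total_floats': 2304, 'points_needed': 384}
```

Both results match the condensed form pow(Points Per Sample, 2) * (length of the attributes) from the question: 64*3 = 192 and 64*6 = 384.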
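
The "axes plus one origin point" idea from the reply to the second comment can be illustrated outside Houdini as well. The sketch below uses NumPy and random numbers as a stand-in for the poses; the point count of 300 and all variable names are assumptions chosen to keep the example small, and it does not reproduce what the PCA SOP does internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in dataset: 128 "poses", each flattened to num_points * 3 floats.
# (Assumed sizes; the poses in the video have far more points.)
num_poses, num_points = 128, 300
data = rng.normal(size=(num_poses, num_points * 3))

# The midpoint: the mean pose, used as the origin of the new coordinate system.
origin = data.mean(axis=0)
centered = data - origin

# Rows of vt are the principal axes, ordered by how much variance they capture.
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)

num_axes = 127
axes = vt[:num_axes]              # 127 axes, each with num_points * 3 dimensions
weights = centered @ axes.T       # coordinates of every pose along those axes

# Rebuild each pose from the origin plus its weighted axes.
reconstructed = origin + weights @ axes
max_error = np.abs(reconstructed - data).max()

print(axes.shape, weights.shape, max_error)
```

In this toy setup the reconstruction error is at floating-point noise level, because 128 samples with their mean subtracted span at most 127 directions. With fewer axes the error grows, which is the trade-off between number of axes and lost information that the reply describes.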
