
What are the centroid of k-means clusters with PCA decomposition?


Tag : python , By : user118656
Date : December 01 2020, 05:00 PM

There are two ways to do what you ask.
You can get the nearest approximation of the centers in the original feature space using PCA's inverse transform (pca is the fitted PCA object and kmeans is the KMeans model fitted on the projected data):
centers = pca.inverse_transform(kmeans.cluster_centers_)
print(centers)

[[ 6.82271303  3.13575974  5.47894833  1.91897312]
 [ 5.80425955  2.67855286  4.4229187   1.47741067]
 [ 5.03012829  3.42665848  1.46277424  0.23661913]]
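
Alternatively, you can recompute the mean of each cluster in the original feature space from the original data X and the cluster labels:
for label in range(kmeans.n_clusters):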
    print(X[kmeans.labels_ == label].mean(0))

[ 6.8372093   3.12093023  5.4627907   1.93953488]
[ 5.80517241  2.67758621  4.43103448  1.45689655]
[ 5.01632653  3.44081633  1.46734694  0.24285714]
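
The two results differ slightly because PCA is lossy. If the full data is cheap enough to cluster, you can also skip the projection and run KMeans directly on X; note that the clusters, and therefore the centers, need not match those found in the PCA space:
print(KMeans(3).fit(X).cluster_centers_)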

[[ 6.85        3.07368421  5.74210526  2.07105263]
 [ 5.9016129   2.7483871   4.39354839  1.43387097]
 [ 5.006       3.418       1.464       0.244     ]]
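
For reference, here is a self-contained sketch of the two ways of recovering centers in the original feature space. It assumes the iris dataset as X and 2 PCA components; the snippets above do not show how pca, kmeans, and X were built, so those choices are illustrative, and n_init and random_state are set only for repeatability.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Assumption: iris stands in for the original data X (150 samples, 4 features).
X = load_iris().data

# Project to 2 components, then cluster in the projected space.
pca = PCA(n_components=2).fit(X)
X_pca = pca.transform(X)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_pca)

# Option 1: map the 2-D cluster centers back to the 4-D feature space.
centers_inverse = pca.inverse_transform(kmeans.cluster_centers_)

# Option 2: average the original rows that share a cluster label.
centers_mean = np.array(
    [X[kmeans.labels_ == label].mean(axis=0) for label in range(kmeans.n_clusters)]
)

print(centers_inverse)
print(centers_mean)
```

Row i of both arrays refers to the same cluster label, so the two can be compared row by row; any gap between them is the part of the cluster mean that the discarded components carried.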


Efficient query to locate centroid of clusters in postgis?


Tag : sql , By : MP.
Date : March 29 2020, 07:55 AM
To take advantage of the spatial index you could use ST_DWithin. What is your search space? Can the centroid be anywhere in space?

Get nearest centroid using Thrust library? (K-Means)


Tag : cpp , By : user181706
Date : March 29 2020, 07:55 AM
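The asker has already computed the distances and stored them in a thrust vector. For instance, with 2 centroids and 5 datapoints, the distances from the first centroid to all 5 datapoints are stored first, followed by the distances from the second centroid, all in a single 1D array (CentroidIndex records which centroid each distance belongs to):
DistancesValues = {10, 15, 20, 12, 10, 5, 17, 22,  8, 7}
DatapointsIndex = {1,   2,  3,  4,  5, 1,  2,  3,  4, 5}
CentroidIndex   = {1,   1,  1,  1,  1, 2,  2,  2,  2, 2}

Here is one possible approach. First, do a sort_by_key using DatapointsIndex as the keys and the other two vectors zipped together as the values. This rearranges all three vectors so that entries for the same datapoint are grouped together:
DatapointsIndex = {1, 1, 2, 2, 3, 3, 4, 4, 5, 5}
Then do a reduce_by_key with the thrust::minimum operator, again keyed on DatapointsIndex. For each datapoint this keeps the smallest (distance, centroid) pair, so the reduced centroid vector holds the nearest centroid for each datapoint, and counting each value in it gives the number of datapoints per centroid. A complete example: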
$ cat t428.cu
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <thrust/copy.h>
#include <thrust/iterator/zip_iterator.h>
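#include <thrust/iterator/discard_iterator.h>
#include <thrust/count.h>       // thrust::count
#include <thrust/functional.h>  // thrust::equal_to, thrust::minimum
#include <thrust/tuple.h>       // thrust::tuple, thrust::make_tuple
#include <stdio.h>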
#define NUM_POINTS 5
#define NUM_CENTROID 2
#define DSIZE (NUM_POINTS*NUM_CENTROID)

int main(){

  int DistancesValues[DSIZE] = {10, 15, 20, 12, 10, 5, 17, 22, 8, 7};
  int DatapointsIndex[DSIZE] = {1, 2,  3,   4,  5,  1,  2,  3, 4, 5};
  int CentroidIndex[DSIZE]   = {1, 1, 1, 1, 1, 2, 2, 2, 2, 2};

  thrust::device_vector<int> DV(DistancesValues, DistancesValues + DSIZE);
  thrust::device_vector<int> DI(DatapointsIndex, DatapointsIndex + DSIZE);
  thrust::device_vector<int> CI(CentroidIndex, CentroidIndex + DSIZE);
  thrust::device_vector<int> Ra(NUM_POINTS);
  thrust::device_vector<int> Rb(NUM_POINTS);

  thrust::sort_by_key(DI.begin(), DI.end(), thrust::make_zip_iterator(thrust::make_tuple(DV.begin(), CI.begin())));
  thrust::reduce_by_key(DI.begin(), DI.end(), thrust::make_zip_iterator(thrust::make_tuple(DV.begin(), CI.begin())), thrust::make_discard_iterator(), thrust::make_zip_iterator(thrust::make_tuple(Ra.begin(), Rb.begin())), thrust::equal_to<int>(), thrust::minimum<thrust::tuple<int, int> >());
  // thrust::count returns a difference_type, so cast it to match the %d format
  printf("CountOfCentroid 1 = %d\n", (int)thrust::count(Rb.begin(), Rb.end(), 1));
  printf("CountOfCentroid 2 = %d\n", (int)thrust::count(Rb.begin(), Rb.end(), 2));

  return 0;
}

$ nvcc -arch=sm_20 -o t428 t428.cu
$ ./t428
CountOfCentroid 1 = 2
CountOfCentroid 2 = 3
$
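
To check the result independently of Thrust, the same nearest-centroid count can be sketched in NumPy. This is not part of the original answer; it only assumes the flat layout described above (all distances for centroid 1, then all distances for centroid 2), and it uses 0-based centroid indices.

```python
import numpy as np

NUM_POINTS = 5
NUM_CENTROIDS = 2

# Same flat layout as DistancesValues: all points for centroid 1, then centroid 2.
distances = np.array([10, 15, 20, 12, 10, 5, 17, 22, 8, 7])

# Rows are centroids, columns are datapoints.
grid = distances.reshape(NUM_CENTROIDS, NUM_POINTS)

# 0-based index of the nearest centroid for each datapoint.
nearest = grid.argmin(axis=0)

# Number of datapoints assigned to each centroid.
counts = np.bincount(nearest, minlength=NUM_CENTROIDS)
print(counts)  # [2 3]
```

The counts agree with the CountOfCentroid values printed by the CUDA program.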

Sklearn: find mean centroid location for clusters?


Tag : python , By : damomurf
Date : March 29 2020, 07:55 AM
The docs of sklearn.decomposition.NMF explain how to get the coordinates of the centroid of each cluster, via the components_ attribute of the fitted model:
In [995]: np.set_printoptions(precision=2)

In [996]: nmf.components_
Out[996]: 
array([[ 0.54,  0.91,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.89,  0.  ,  0.89,  0.37,  0.54,  0.  ,  0.54],
       [ 0.  ,  0.01,  0.71,  0.  ,  0.  ,  0.  ,  0.71,  0.72,  0.71,  0.01,  0.02,  0.  ,  0.71,  0.  ],
       [ 0.  ,  0.01,  0.61,  0.61,  0.61,  0.61,  0.  ,  0.  ,  0.  ,  0.62,  0.02,  0.  ,  0.  ,  0.  ]])
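
A minimal sketch of how such a components_ matrix might be obtained. The data here is random and purely illustrative (the original matrix is not shown), and assigning each sample to the component with the largest weight is a common heuristic assumed for this example, not something the answer above states.

```python
import numpy as np
from sklearn.decomposition import NMF

# Assumption: random non-negative data stands in for the real input matrix.
rng = np.random.default_rng(0)
X = rng.random((20, 14))

nmf = NMF(n_components=3, init="nndsvda", random_state=0, max_iter=1000)
W = nmf.fit_transform(X)  # per-sample weights, shape (20, 3)
H = nmf.components_       # one row per component, shape (3, 14)

# Heuristic cluster label: the component with the largest weight per sample.
labels = W.argmax(axis=1)

np.set_printoptions(precision=2)
print(H)
```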

How to calculate the distance between a document and each centroid (k-means)?


Tag : python , By : Sonal
Date : March 29 2020, 07:55 AM
You can use the predict method to get the closest cluster for each sample in a matrix X:
from sklearn.cluster import KMeans

model = KMeans(n_clusters=K)
model.fit(X_train)
label = model.predict(X_test)
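
Since the question asks for the distance to each centroid rather than only the closest cluster, KMeans.transform may be the more direct tool: it maps samples to a cluster-distance space, one column per centroid. The sketch below uses random data and K = 3 as stand-ins for the real document vectors.

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumption: random vectors stand in for the document features.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 4))
X_test = rng.normal(size=(10, 4))
K = 3

model = KMeans(n_clusters=K, n_init=10, random_state=0).fit(X_train)

# Euclidean distance from every test sample to every centroid, shape (10, K).
distances = model.transform(X_test)

# The same distances computed by hand from cluster_centers_.
manual = np.linalg.norm(
    X_test[:, None, :] - model.cluster_centers_[None, :, :], axis=2
)

# predict returns the column with the smallest distance.
label = model.predict(X_test)
print(distances)
```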

Can we rank K-Means clusters or assign weights to certain clusters?


Tag : python , By : Michael
Date : March 29 2020, 07:55 AM
One "cheat" trick is to include the Rating feature two or three times, so that it automatically gets more weight:
data = np.asarray([
    np.asarray(dataset['Rating']),
    np.asarray(dataset['Rating']),  # duplicated on purpose to increase its weight
    np.asarray(dataset['Maturity']),
    np.asarray(dataset['Score']),
    np.asarray(dataset['Bin']),
    np.asarray(dataset['Price1']),
    np.asarray(dataset['Price2']),
    np.asarray(dataset['Price3']),
]).T
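
Under the Euclidean distance that k-means uses, duplicating a column has the same effect as multiplying it by sqrt(2), which allows finer weights than whole-number repeats. The sketch below checks this on made-up rating and score values (both hypothetical stand-ins for the dataset columns).

```python
import numpy as np

# Assumption: random values stand in for two of the dataset columns.
rng = np.random.default_rng(0)
rating = rng.random(6)
score = rng.random(6)

# Rating included twice versus Rating scaled by sqrt(2).
duplicated = np.column_stack([rating, rating, score])
scaled = np.column_stack([np.sqrt(2) * rating, score])

def pairwise_sq_dists(A):
    """Squared Euclidean distance between every pair of rows."""
    diff = A[:, None, :] - A[None, :, :]
    return (diff ** 2).sum(axis=2)

same = np.allclose(pairwise_sq_dists(duplicated), pairwise_sq_dists(scaled))
print(same)  # True
```

More generally, multiplying a feature by sqrt(w) gives it weight w in the squared distance, so scaling the column is a drop-in alternative to repeating it.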