Cosine similarity returning wrong distance
Tag : java , By : Jay Crockett
Date : March 29 2020, 07:55 AM
Your code is fine. The vectors are dominated by a handful of large features, and on those features the two vectors are almost collinear (the ratio vec2/vec1 is nearly constant), which is why the similarity measure is close to 1.

feature     vec1     vec2        vec2/vec1
64806110    2875     1.85E+07    6.43E+03
64806108    5750     3.68E+07    6.40E+03
64806107    8625     5.49E+07    6.37E+03
64806106    11500    7.29E+07    6.34E+03
64806111    14375    9.07E+07    6.31E+03
64806109    17250    1.08E+08    6.28E+03
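To see the effect concretely, here is a minimal NumPy sketch. It reuses the six large features from the table and appends four small features on which the two vectors disagree completely. Those small values and the helper `cosine_similarity` are made up for illustration and are not part of the original question.

```python
import numpy as np

def cosine_similarity(u, v):
    """Plain cosine similarity of two 1-D arrays."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# The six dominant features, taken from the table above.
vec1_large = np.array([2875, 5750, 8625, 11500, 14375, 17250], dtype=float)
vec2_large = np.array([1.85e7, 3.68e7, 5.49e7, 7.29e7, 9.07e7, 1.08e8])

# Hypothetical small features with no overlap at all (their dot product is 0).
vec1_small = np.array([1.0, 0.0, 3.0, 0.0])
vec2_small = np.array([0.0, 2.0, 0.0, 5.0])

vec1 = np.concatenate([vec1_large, vec1_small])
vec2 = np.concatenate([vec2_large, vec2_small])

sim_large = cosine_similarity(vec1_large, vec2_large)  # large features only
sim_small = cosine_similarity(vec1_small, vec2_small)  # small features only
sim_all = cosine_similarity(vec1, vec2)                # everything together

print(sim_large, sim_small, sim_all)
```

The small features are orthogonal, yet the combined similarity is practically identical to the one computed on the large features alone. If the small features matter, scaling each feature (for example to unit variance) before computing the similarity is one common option.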
|
Choice between an adjusted cosine similarity vs regular cosine similarity
Date : March 29 2020, 07:55 AM
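Why would a regular cosine similarity result in a positive number for such 'different' items? Every component of these vectors is positive, so they all point into the same quadrant and the angle between any two of them is small. Subtracting a common offset first changes that:
from scipy import spatial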
import numpy as np
a = np.array([2.0,1.0])
b = np.array([5.0,3.0])
1 - spatial.distance.cosine(a,b)
#----------------------
# 0.99705448550158149
#----------------------
c = np.array([5.0,4.0])
1 - spatial.distance.cosine(c,b)
#----------------------
# 0.99099243041032326
#----------------------
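# NOTE: the builtin sum(a, b) uses b as the start value, so sum(a, b) is
# b + a[0] + a[1] = [8., 6.] and mean_ab comes out as 3.5, not the
# arithmetic mean of the four entries (11 / 4 = 2.75).
mean_ab = sum(sum(a,b)) / 4
# mean_ab : 3.5
# adjusted vectors (a, b) : [-1.5, -2.5] , [1.5, -0.5]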
1 - spatial.distance.cosine(a - mean_ab, b - mean_ab)
#----------------------
# -0.21693045781865616
#----------------------
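# Same builtin-sum behaviour: sum(c, b) is b + c[0] + c[1] = [14., 12.],
# so mean_cb is 6.5 (the arithmetic mean of the four entries is 17 / 4 = 4.25).
mean_cb = sum(sum(c,b)) / 4
# mean_cb : 6.5
# adjusted vectors (c, b) : [-1.5, -2.5] , [-1.5, -3.5]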
1 - spatial.distance.cosine(c - mean_cb, b - mean_cb)
#----------------------
# 0.99083016804429891
#----------------------
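The snippet above computes its offset with `sum(sum(a, b)) / 4`. Because the builtin `sum` treats its second argument as a start value, that expression gives 3.5 rather than the arithmetic mean 2.75. The sketch below repeats the comparison with the true mean of all four entries, using NumPy only. The helper names `cosine_similarity` and `centered_cosine` are illustrative, and using one shared offset for both vectors is an assumption that mirrors the snippet rather than a standard definition.

```python
import numpy as np

def cosine_similarity(u, v):
    """Plain cosine similarity of two 1-D arrays."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def centered_cosine(u, v):
    """Cosine similarity after subtracting the mean of all entries of u and v."""
    offset = np.concatenate([u, v]).mean()
    return cosine_similarity(u - offset, v - offset)

a = np.array([2.0, 1.0])
b = np.array([5.0, 3.0])
c = np.array([5.0, 4.0])

# What the builtin-sum expression evaluates to, versus the true mean.
builtin_offset = sum(sum(a, b)) / 4          # 3.5
true_offset = np.concatenate([a, b]).mean()  # 2.75

sim_ab = centered_cosine(a, b)  # negative: a and b lie on opposite sides of the mean
sim_cb = centered_cosine(c, b)  # positive: c and b lie on the same side

print(builtin_offset, true_offset, sim_ab, sim_cb)
```

With the true mean the numbers differ from those reported above (roughly -0.49 for a and b, roughly 0.76 for c and b), but the qualitative picture is the same: centering separates the pair that the plain cosine similarity rated as nearly identical.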
|
Pairwise Cosine Similarity using TensorFlow
Date : March 29 2020, 07:55 AM
There is an answer for getting a single cosine distance here: https://stackoverflow.com/a/46057597/288875 . It is based on tf.losses.cosine_distance . Here is a solution that computes all pairwise distances between the rows of a matrix (written against the TensorFlow 1.x graph API):
import numpy as np
import tensorflow as tf
from sklearn.metrics.pairwise import pairwise_distances

M = 3
input_matrix = np.array(
    [[1, 1, 1],
     [0, 1, 1],
     [0, 0, 1]],
    dtype='float32')

with tf.Session() as sess:
    # input (named `inputs` to avoid shadowing the builtin `input`)
    inputs = tf.placeholder(tf.float32, shape=(M, M))
    # normalize each row to unit length (`axis` replaces the deprecated `dim`)
    normalized = tf.nn.l2_normalize(inputs, axis=1)
    # matrix product with the transpose: entry (i, j) is the dot product
    # of unit rows i and j, i.e. their cosine similarity
    prod = tf.matmul(normalized, normalized,
                     adjoint_b=True)  # transpose second matrix
    dist = 1 - prod

    print("input_matrix:")
    print(input_matrix)
    # reference result from scikit-learn
    print("sklearn:")
    print(pairwise_distances(input_matrix, metric='cosine'))
    print("tensorflow:")
    print(sess.run(dist, feed_dict={inputs: input_matrix}))
Output:
input_matrix:
[[ 1. 1. 1.]
[ 0. 1. 1.]
[ 0. 0. 1.]]
sklearn:
[[ 0. 0.18350345 0.42264974]
[ 0.18350345 0. 0.29289323]
[ 0.42264974 0.29289323 0. ]]
tensorflow:
[[ 5.96046448e-08 1.83503449e-01 4.22649741e-01]
[ 1.83503449e-01 5.96046448e-08 2.92893231e-01]
[ 4.22649741e-01 2.92893231e-01 0.00000000e+00]]
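For readers without a TensorFlow 1.x install, the same normalize-then-multiply idea can be sketched in plain NumPy. NumPy stands in for TensorFlow here, and the function name `pairwise_cosine_distance` is illustrative. The sketch uses float64, so the diagonal comes out as zero up to rounding; the 5.96046448e-08 entries in the TensorFlow output above are float32 rounding error (the value equals 2**-24).

```python
import numpy as np

def pairwise_cosine_distance(x):
    """Cosine distance between every pair of rows of a 2-D array."""
    # Scale each row to unit length.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / norms
    # Entry (i, j) of unit @ unit.T is the cosine similarity of rows i and j.
    return 1.0 - unit @ unit.T

input_matrix = np.array(
    [[1, 1, 1],
     [0, 1, 1],
     [0, 0, 1]],
    dtype=np.float64)

dist = pairwise_cosine_distance(input_matrix)
print(dist)
```

Rows of all zeros would divide by zero in this sketch; if they can occur, the norms need a small lower bound or those rows need separate handling.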
|
Computing the Cosine Similarity of two sets of vectors in Tensorflow
Tag : python , By : user179271
Date : March 29 2020, 07:55 AM
You can compute that simply like this:
import tensorflow as tf
# Vectors
a = tf.placeholder(tf.float32, shape=[600, 52])
b = tf.placeholder(tf.float32, shape=[16000, 52])
# Cosine similarity
similarity = tf.reduce_sum(a[:, tf.newaxis] * b, axis=-1)
# Only necessary if vectors are not normalized
similarity /= tf.norm(a[:, tf.newaxis], axis=-1) * tf.norm(b, axis=-1)
# If you prefer the distance measure
distance = 1 - similarity
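One thing to keep in mind with the broadcast product above: `a[:, tf.newaxis] * b` has shape (600, 16000, 52), which is about 5e8 float32 values (roughly 2 GB) if it is materialized in full. A matrix product of the normalized inputs gives the same (600, 16000) result without that intermediate. The NumPy sketch below checks the equivalence on smaller, randomly generated stand-in shapes. In TensorFlow the corresponding call would be `tf.matmul(a_unit, b_unit, transpose_b=True)`; the names `a_unit` and `b_unit` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Small stand-ins for the (600, 52) and (16000, 52) inputs.
a = rng.normal(size=(6, 52))
b = rng.normal(size=(16, 52))

# Broadcast form, as in the snippet above: builds a (6, 16, 52) intermediate.
broadcast_sim = (a[:, np.newaxis] * b).sum(axis=-1)
broadcast_sim /= np.linalg.norm(a[:, np.newaxis], axis=-1) * np.linalg.norm(b, axis=-1)

# Matrix-product form: normalize the rows once, then a single matmul.
a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
matmul_sim = a_unit @ b_unit.T

print(np.allclose(broadcast_sim, matmul_sim))
```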
|
Tensorflow cosine similarity between each tensor in a list
Date : March 29 2020, 07:55 AM
I have two lists (arrays) of tensors and want to compute the cosine similarity between corresponding tensors in the two lists, getting back a list (tensor) of similarities.
import numpy as np
import tensorflow as tf
a = tf.placeholder(tf.float32, shape=[None,3], name="input_placeholder_a")
b = tf.placeholder(tf.float32, shape=[None,3], name="input_placeholder_b")
numerator = tf.reduce_sum(tf.multiply(a, b), axis=1)
denominator = tf.multiply(tf.norm(a, axis=1), tf.norm(b, axis=1))
cos_similarity = numerator/denominator
sess=tf.Session()
cos_sim=sess.run(cos_similarity,feed_dict={
a: np.array([[1, 2, 3],
[4, 5, 6]]),
b: np.array([[1, 2, 3],
[8, 7, 9]]),
})
print(cos_sim)
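As a cross-check of what the graph above should return, the same row-wise formula can be written in NumPy. This is a sketch with an illustrative function name (`rowwise_cosine_similarity`), not part of the original answer. The first pair of rows is identical, so its similarity is 1; for the second pair the dot product is 121 and the squared norms are 77 and 194, which gives roughly 0.99.

```python
import numpy as np

def rowwise_cosine_similarity(a, b):
    """Cosine similarity between row i of a and row i of b."""
    numerator = np.sum(a * b, axis=1)
    denominator = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return numerator / denominator

a = np.array([[1, 2, 3],
              [4, 5, 6]], dtype=np.float64)
b = np.array([[1, 2, 3],
              [8, 7, 9]], dtype=np.float64)

cos_sim = rowwise_cosine_similarity(a, b)
print(cos_sim)
```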
|