Example - Vector Recommendations With NBA Players

NBA analytics keeps introducing new statistics, so it no longer makes much sense to compare players using only traditional, human-oriented approaches and judgements.

Here, we want to be able to identify players that have similar statistics to other players without having to look at each statistic line by line. We want to just know.

Below, we investigate the different players and their most similar counterparts as well as examine the efficiencies between the different players.
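
To make "similar statistics" concrete, each player's stat line can be treated as a vector and other players ranked by how close their vectors are. The sketch below is only an illustration: the player names and numbers are invented, `most_similar` is a hypothetical helper rather than part of the Vector AI client, and cosine similarity is just one common choice of metric (this example does not state which one the service uses).

```python
import math

# Hypothetical stat lines (points, rebounds, assists), invented for illustration only.
players = {
    "Player A": [25.0, 5.0, 8.0],
    "Player B": [24.0, 4.5, 9.0],
    "Player C": [10.0, 12.0, 1.5],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(name, vectors):
    """Return the other player whose vector is most similar to `name`."""
    target = vectors[name]
    others = (n for n in vectors if n != name)
    return max(others, key=lambda n: cosine_similarity(target, vectors[n]))

print(most_similar("Player A", players))
```

With real data the vectors would have one entry per statistic, which is why the comparison is delegated to a vector search service instead of being done by hand.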

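import pandas as pd

# Load the per-36-minute and per-game stat tables; the first row of the per-36 sheet is skipped.
nba_per_36 = pd.read_excel('data/nba_per_36.xlsx', skiprows=[0])
nba_per_game = pd.read_excel('data/nba_per_game.xlsx')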
151 Angel Delgado Lac C 24.39 2 7.4 15.4 16.8 0.0 2 0.500 5 0.200 0 0.000 0.200 0.255 1.5 2.0 14.3 0.0 0.0 0.50 0.00 0.00 0.0 79.2 98.9
587 John Wall Was G 28.60 32 34.5 71.9 28.8 16.3 175 0.697 382 0.508 169 0.302 0.491 0.528 20.7 3.6 5.7 8.7 39.2 1.53 0.91 3.81 10.0 104.1 111.5
from vectorai import ViClient
# username, api_key and url are your own Vector AI credentials, defined beforehand.
vi_client = ViClient(username, api_key, url)
Logged in. Welcome jacky-wong. To view list of available collections, call list_collections() method.
from sklearn.preprocessing import StandardScaler

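def create_collection(df, collection_name):
    # Missing stats become 0 so every player gets a complete vector.
    df = df.fillna(0)
    # Standardise the numeric columns so no single stat dominates the vector.
    scaler = StandardScaler()
    season_vector = scaler.fit_transform(df.drop(['FULL NAME', 'TEAM', 'POS', 'AGE', 'MPG'], axis=1))
    df['season_vector_'] = season_vector.tolist()
    # Assumption: replace any existing collection of the same name so reruns start clean.
    if collection_name in vi_client.list_collections():
        vi_client.delete_collection(collection_name)
    return vi_client.insert_df(collection_name, df, chunksize=100)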

create_collection(nba_per_game, 'nba_season_per_game_stats_demo')

{'inserted_successfully': 622, 'failed': 0, 'failed_document_ids': []}
create_collection(nba_per_36, 'nba_season_per_36_stats_demo')

{'inserted_successfully': 212, 'failed': 0, 'failed_document_ids': []}
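
The `create_collection` function above relies on `StandardScaler` to put every statistic on a comparable scale before building `season_vector_`. As a rough sketch of what that step does to a single column, the snippet below computes z-scores by hand using the population standard deviation. The `standardize` helper and the sample columns are hypothetical and only meant to show why raw minutes and shooting percentages end up comparable after scaling.

```python
import statistics

def standardize(column):
    """Z-score a list of numbers: subtract the mean, divide by the population std."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    # Assumption for this sketch: a constant column maps to zeros to avoid dividing by zero.
    if std == 0:
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]

# Hypothetical columns on very different scales.
minutes = [10.0, 20.0, 30.0]
shooting_pct = [0.40, 0.50, 0.60]

print(standardize(minutes))
print(standardize(shooting_pct))
```

Both columns come out with nearly identical values after scaling, so a statistic measured in tens of minutes no longer outweighs one measured as a fraction when the vectors are compared.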

Visualising NBA players

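job = vi_client.dimensionality_reduction_job('nba_season_per_game_stats_demo', vector_field='season_vector_', n_components=2)
# Wait for the per-game job too, since `job` is reassigned below and the plot uses this collection.
vi_client.wait_till_jobs_complete('nba_season_per_game_stats_demo', **job)
job = vi_client.dimensionality_reduction_job('nba_season_per_36_stats_demo', vector_field='season_vector_', n_components=2)
vi_client.wait_till_jobs_complete('nba_season_per_36_stats_demo', **job)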
{'status': 'Finished'}
# rename cluster_field to vector_field
fig = vi_client.plot_dimensionality_reduced_vectors(
    collection='nba_season_per_game_stats_demo', point_label='FULL NAME', cluster_field='season_vector_', cluster_label='POS',
    title="NBA Players Stats Per Game",