$\begingroup$

I want to run some machine learning algorithms, such as PCA and KNN, on a relatively large dataset of images (>2000 RGB images) in order to classify them.

My source code is the following:

import cv2
import numpy as np
from glob import glob
from sklearn.decomposition import PCA
from sklearn import neighbors
from sklearn import preprocessing

data = []

# Read images from file
for filename in glob('Images/*.jpg'):
    img = cv2.imread(filename)
    height, width = img.shape[:2]
    # Check that all my images are of the same resolution
    if height == 529 and width == 940:
        # Reshape each image so that it is stored in one line
        img = img.flatten()
        data.append(img)

# Normalise data
data = np.array(data)
norm = preprocessing.Normalizer()
data = norm.fit_transform(data)

# PCA model
pca = PCA(0.95)
data = pca.fit_transform(data)

# K-Nearest neighbours
knn = neighbors.NearestNeighbors(n_neighbors=4, algorithm='ball_tree',
                                 metric='minkowski').fit(data)
distances, indices = knn.kneighbors(data)
print(indices)
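For reference, here is a minimal, runnable version of the same pipeline (Normalizer → PCA → NearestNeighbors) on synthetic data; the random matrix simply stands in for the flattened image matrix, and all sizes are arbitrary:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import Normalizer

# Synthetic stand-in for the flattened image matrix:
# 50 "images" of 600 values each (arbitrary sizes).
rng = np.random.default_rng(0)
data = rng.random((50, 600))

# L2-normalise each row, then keep 95% of the variance
data = Normalizer().fit_transform(data)
pca = PCA(n_components=0.95)
data = pca.fit_transform(data)

# 4 nearest neighbours of every sample (each sample's nearest
# neighbour is itself, at distance 0)
knn = NearestNeighbors(n_neighbors=4, algorithm='ball_tree').fit(data)
distances, indices = knn.kneighbors(data)
print(indices.shape)  # (50, 4)
```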

However, my laptop is not powerful enough for this task: it needs many hours to process even 700 RGB images. So I need to use the computational resources of an online platform (e.g. the ones provided by GCP).

Can I simply make a call from PyCharm to the Compute Engine API (after I have created a virtual machine in it) to run my Python script?

Or is it possible either to install PyCharm on the virtual machine and run the Python script there, or to put my source code in a Docker container?

In all, how can I simply run a Python script on GCP Compute Engine without wasting time on needless setup?

$\endgroup$

    2 Answers

    $\begingroup$

    First, you will need to install the Cloud SDK: https://cloud.google.com/sdk/downloads#apt-get

    Then, the simplest way is to run your script through your terminal (these are Mac instructions, but I presume they also work on Linux):

    1. Configure your project: gcloud config set project insert_your_project_name
    2. Set up SSH keys: gcloud compute config-ssh
    3. Connect to the VM: gcloud beta compute ssh vm_name --internal-ip
    4. Run script: python your_script.py

    You can also connect PyCharm directly to GCP and run everything on your VM, but you will need PyCharm Pro; otherwise the deployment option is not available. Let me know if this works.

    Also, if you want to set your project interactively, then in step 1 run this instead: gcloud init
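    Put together, a non-interactive session might look like the sketch below. The project and VM names ("my-project", "my-vm") are placeholders, and gcloud compute scp is just one way to get the script onto the VM:

```shell
# Sketch of the steps above in one session ("my-project" and
# "my-vm" are placeholder names; requires an authenticated gcloud).
gcloud config set project my-project        # step 1 (or: gcloud init, interactively)
gcloud compute config-ssh                   # step 2: set up SSH keys/host aliases
gcloud compute scp your_script.py my-vm:~   # copy the script to the VM
gcloud compute ssh my-vm --command "python3 your_script.py"   # steps 3-4
```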

    $\endgroup$
    •
      $\begingroup$I appreciate that you're trying to help the asker, but I don't think you should answer an off-topic question like this one. To do so is to encourage more of them, and we've had a long-running problem of people thinking that this site is an appropriate place to ask things like "How do I do a $t$-test in Excel?". Flag or vote the question to be closed or moved to Stack Overflow instead.$\endgroup$
      – Kodiologist
      Commented Feb 1, 2018 at 18:01
    • $\begingroup$Got it, apologies!$\endgroup$ Commented Feb 1, 2018 at 19:21
    • $\begingroup$Thank you for your answer @francium87d. I will try your suggestion; indeed, it seems to be one of the simplest ways to run a Python script (from my laptop) on GCP Compute Engine. Where exactly did you find out how to do this? I am asking because I have not found any informative documentation about GCP on the Internet...$\endgroup$
      – Outcast
      Commented Feb 2, 2018 at 9:19
    • $\begingroup$@Universalis we use VMs to run scripts at my company, so I have gone through this process before.$\endgroup$ Commented Feb 2, 2018 at 9:55
    • $\begingroup$OK, so you mean that someone told you this and you did not exactly find it on the Internet... because I, at least, have not found it yet...$\endgroup$
      – Outcast
      Commented Feb 2, 2018 at 10:03
    $\begingroup$

    The other option is to set up a Jupyter notebook server on GCP. You can use the following command to run Jupyter in the background:

    nohup jupyter notebook --ip=0.0.0.0 &

    Now you can set up an SSH tunnel into the GCP VM:

    ssh username@<public_ip> -L 8888:127.0.0.1:8888 

    Now you should be able to access the Jupyter notebook from your local machine at the following URL in your browser:

    127.0.0.1:8888 
    $\endgroup$
