
Archive for October, 2013

CDA Lecture: Python and the Future of Data Analysis

Posted by on Thursday, 17 October, 2013

Speaker:         Peter Wang, co-Founder of Continuum Analytics

Date:                 Friday, October 18, 2013

Location:        Klaus 1447

Time:                 2-3pm


Abstract:
While Python has been a popular and powerful language for scientific computing for a while now, its future in the broader data analytics realm is less clear, especially as market forces and technological innovation are rapidly transforming the field.

In this talk, Peter will introduce some new perspectives on “Big Data” and the evolution of programming languages, and present his thesis that Python has a central role to play in the future not just of scientific computing, but of analytics and even computing in general. Along the way, he will survey many new libraries, tools, and technologies (both Python and non-Python), examining why they exist and where they are driving technical evolution.

Bio:
Peter holds a B.A. in Physics from Cornell University and has been developing applications professionally using Python since 2001. Before co-founding Continuum Analytics in 2011, Peter spent seven years at Enthought designing and developing applications for a variety of companies, including investment bankers, high-frequency trading firms, oil companies, and others. In 2007, Peter was named Director of Technical Architecture and served as client liaison on high-profile projects. Peter also developed Chaco, an open-source, Python-based toolkit for interactive data visualization. Peter’s roles at Continuum Analytics include product design and development, software management, business strategy, and training.

October 2013 PACE maintenance complete

Posted by on Thursday, 17 October, 2013
Greetings! We have completed our maintenance activities for October.  All clusters are open, and jobs are flowing.  We came across (and dealt with) a few minor glitches, but I’m very happy to say that no major problems were encountered.  As such, we were able to accomplish all of our goals for this maintenance window.

  • All project storage servers have had their operating systems updated.  This should protect against failures during high load.  Between these fixes and the networking fixes below, we believe all of the root causes of the storage problems we’ve been having recently have been resolved.
  • All of our redundancy changes and code upgrades to network equipment have been completed.
  • The decentralization of job scheduling services has been completed.  You should see significantly improved responsiveness when submitting jobs or checking on the status of existing jobs.
    • Please note that you will likely need to resubmit jobs that did not have a chance to run before Tuesday.  Contrary to previously announced and intended designs, this affects the shared clusters as well.  We apologize for the inconvenience.
    • Going forward, the scheduler decentralization has a notable side effect.  Previously, any login node could submit jobs to any queue, as long as the user had access to do so.  Now, this may no longer be the case.
    • For instance, a user of the dedicated cluster “Optimus” who also had access to the FoRCE could previously submit jobs to the force queue from the Optimus head node.  That user will no longer be able to do so, as Optimus and FoRCE are now scheduled by different servers.
    • We believe that these cases should be quite uncommon.  If you do encounter this situation, you should be able to simply log in to the other head node and submit your jobs from there.  You will have the same home, project, and scratch directories from either place.  Please let us know if you have problems.
  • All RHEL6 clusters now have access to our new GPFS filesystem.  Additionally, all of the applications in /usr/local (MATLAB, Abaqus, PGI compilers, etc.) have been moved to this storage.  This should improve performance both for these applications and for the Panasas scratch storage, which previously hosted this software.
  • Many of our virtual machines have been moved to different storage.  This should provide an improvement in the responsiveness of your login nodes.  Please let us know (via pace-support@oit.gatech.edu) if you see undesirable performance from your login nodes.
  • The Atlantis cluster has been upgraded from RHEL5 to RHEL6 (actually, this happened before this week), and 31 Infiniband-connected nodes from the RHEL5 side of the Atlas cluster have been upgraded to RHEL6.  (The 32nd has hardware problems and has been shut down.)
  • The /nv/pf2 project filesystem has been migrated to a server with more breathing room.
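If you need to resubmit jobs that were lost during the scheduler migration, the process is the same as any normal submission; the only change is that you must submit from the head node of the cluster whose scheduler now owns the target queue.  As a rough sketch (assuming Torque/Moab-style qsub/qstat commands; the head-node name, script name, and resource requests below are hypothetical examples, not actual PACE settings):

```shell
# Log in to the head node that schedules the queue you want.
# Hypothetical example: the "force" queue must now be reached from
# the FoRCE head node, not from Optimus.
ssh force.pace.gatech.edu

# myjob.pbs is a hypothetical job script along these lines:
#   #PBS -N myjob
#   #PBS -q force
#   #PBS -l nodes=1:ppn=8,walltime=2:00:00
#   cd $PBS_O_WORKDIR
#   ./run_analysis

qsub myjob.pbs      # submit (or resubmit) the job
qstat -u $USER      # check the status of your own jobs
```

If qsub reports that the queue is unknown, you are most likely on a head node served by a different scheduler; log in to the other cluster's head node and submit from there.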

Additionally, we were able to complete a couple of bonus objectives.

  • You’ll notice a new message when logging in to your clusters.  Part of this message is brought to you from our Information Security department, and the rest is intended to give a high-level overview of the specific cluster and the queue commonly associated with it.
  • We added Infiniband network redundancy for the DDN/GPFS storage.
  • The /nv/pase1 filesystem was moved off of temporary storage and onto the server purchased for the Ase1 cluster.