Director's Blog
2009 April

April 28, 2009

Google Books

Filed under: tech — Tom Holub @ 3:42 pm

I wrote the following in response to an email I received from a department chair; I thought it would be of general interest.  The subject is Google Books, and UCOP’s endorsement of a settlement agreement in the class-action lawsuit against Google.  (Google is being sued for violating copyrights by scanning and publishing books which are in copyright but out of print.)

I’ll note that this isn’t really in my area of expertise.  I think there are reasonable arguments to be made on both sides of the issue.  (See, for example, Courant’s response to Darnton’s article, and Darnton’s response to that: http://www.nybooks.com/articles/22496).

Academics have competing needs related to copyright.  On the one hand, universities and libraries can almost universally applaud greater access to public-domain works.  Easy access to digital versions of the score of Beethoven’s 9th or the works of Isaac Newton provides great benefit to instruction and research, without impinging on copyright.  [There is a distinction between works whose essential character can be easily duplicated, such as books or music, and other media such as sculpture or painting.  The Louvre may not exactly claim copyright on the Mona Lisa, but they won't let you go in and take a high-quality digital image of it without paying a fee.]

Even digitizing the works of Beethoven or Newton has an effect; publishers who might otherwise produce new printed versions might be less likely to do so, because the size of their market has been reduced by the easy availability of electronic versions.  Still, I think most academics would agree that the overall benefit to the public of having the electronic versions available is the primary consideration.

The Google Books project goes a step further, by digitizing copyrighted works which are currently out of print.  This is very much aligned with Google’s corporate philosophy of collecting and providing as much information as they possibly can.  In some ways it’s a clear public benefit–people all over the world can get access to books that they aren’t able to buy–but the copyright holders are understandably concerned.  Just because something’s out of print now doesn’t mean it will be out of print forever–except that once it’s in Google Books, it’s probably less likely to get reprinted.  This is part of the academic objection, since many of our faculty write books which could end up in Google Books.  I think it’s a real effect, but I also think that many faculty would choose to have more readers of their work, even if it meant fewer actual book sales.

I think the concern that Google will disadvantage universities the way the journal companies have is largely unfounded.  It’s true that we don’t know what Google will do in the future, and corporate interests are often not aligned with academic interests.  But Google’s corporate philosophy is bound (no pun intended) to the concept of free content; I can’t imagine them charging thousands of dollars for content access to any of their properties, including Google Books.

Then there’s the underlying concern of the library, that Google Books and services like it could threaten the library by making it appear obsolete.  I think this is a real possibility, but it’s a real possibility no matter what happens with the Google Books situation. Right now, the technology to read books electronically is very immature and only marginally usable, but if someone comes up with a good e-book solution, the library as we know it will have to change radically to remain relevant.

April 27, 2009

Spam, spam, spam, spam…

Filed under: tech — Tom Holub @ 3:52 pm

We’ve received numerous reports from our customers about an apparent increase in spam in recent weeks.  Indeed, looking at spam statistics shows that spam activity has risen significantly in the past couple of months; see the charts at MessageLabs, which are complete through March.  I’m guessing that April will be even higher.  They list the spam rate as 75.7%; over three-quarters of all the mail sent on the Internet is spam.

Now, our spam filters are generally pretty good about catching spam; probably 80-90% of the total spam volume is caught before users see it.  But spam protection is always an arms race, and right now the spammers have come up with some new techniques which are succeeding at getting around many of the spam filters.  Getting around spam filters is a bounded problem; blocking spam is an unbounded problem.
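
To see why a higher spam rate is noticeable even when the filters are working, here is a rough back-of-the-envelope sketch in Python.  It uses the 75.7% spam rate and the 80-90% catch rate mentioned above, and it assumes (as a simplification, not a measured figure) that the filter never blocks legitimate mail.

```python
def inbox_spam_fraction(spam_rate, catch_rate):
    """Fraction of delivered mail that is spam.

    Assumes no legitimate mail is blocked (a simplification).
    """
    spam_delivered = spam_rate * (1.0 - catch_rate)
    legit_delivered = 1.0 - spam_rate
    return spam_delivered / (spam_delivered + legit_delivered)


SPAM_RATE = 0.757  # spam share of all mail, from the MessageLabs figure

for catch_rate in (0.80, 0.90):
    frac = inbox_spam_fraction(SPAM_RATE, catch_rate)
    print(f"catch rate {catch_rate:.0%}: about {frac:.0%} of delivered mail is spam")
```

Under those assumptions, roughly a quarter to more than a third of what reaches the inbox would still be spam, and that share grows quickly as the overall spam rate rises.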

We have a page with some suggestions on how to deal with spam.  One of those is that you can forward a spam message to spam@berkeley.edu to report it as spam; Calmail will use your message to help improve their spam filters.  (The message has to be forwarded as an attachment.)

We’re also in the process of moving all of our mail service over to Calmail.  Calmail allows us to host a mail domain such as LS.berkeley.edu on their servers.  They’re a much bigger operation, and they have many more resources to put into spam protection.  We’ve moved the math.berkeley.edu domain over, and are initiating projects to migrate the rest of the domains we run on departmental servers.  We expect that migrating to Calmail will reduce everyone’s spam counts, and provide more reliable service as well.  (See my post on high availability.)

April 3, 2009

Update on network funding

Filed under: administrative, network — Tom Holub @ 4:48 pm

As noted in an earlier blog entry, the campus is moving towards an FTE-based model for network funding.  Thursday I participated in the latest meeting of the advisory committee, and we got a lot more detail about the plan.

The biggest news is that implementation has been delayed until January 2010.  The committee and the Cabinet are in agreement that the plan is not ready for implementation this coming July 1; there are too many open questions and logistical problems.  So for at least the first six months of the fiscal year, network charges will be the same as they currently are.

Also big news for certain L&S departments is that completion of campus infrastructure projects, including ubiquitous AirBears, is included in the funding model.  That means that some of the buildings with the worst networking, such as Tolman, Wheeler, and Kroeber, will receive network upgrades as part of the adoption of the new model.  Timelines are unclear, and there will probably still be a required departmental contribution for new horizontal cables, but this is definitely good news for our most underserved departments.

Undergraduate students are now included in the funding model at a 0.15 FSE rate.  The proposal is to pay for students via a student technology fee; the campus is working with UCOP (and probably going to the Regents) to implement such a fee in the January 2010 timeframe.  [Not knowing anything about the politics involved, I'd have to say that timeframe looks unrealistic to me, but we'll see.]

The inclusion of students in the model has brought the cost per FSE down below $40/month/FSE, from original projections of $45/month.  The figure we are currently seeing is $38.08/month; that is subject to change due to negotiations over FSE counts, but it should remain below $40.
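
As a quick illustration of how the per-FSE rate translates into a per-person charge, here is a minimal Python sketch.  It assumes the charge is simply the FSE weight times the monthly rate; the function name is mine, and the $38.08 figure is the draft rate, which may still change.

```python
RATE_PER_FSE = 38.08     # dollars per month per FSE (current draft figure)
UNDERGRAD_WEIGHT = 0.15  # undergraduates count as 0.15 FSE each


def monthly_charge(fse_weight, rate=RATE_PER_FSE):
    """Monthly network charge in dollars for one person with the given FSE weight."""
    return fse_weight * rate


print(f"full FSE:  ${monthly_charge(1.0):.2f}/month")
print(f"undergrad: ${monthly_charge(UNDERGRAD_WEIGHT):.2f}/month")
```

At the draft rate, each undergraduate would account for a little under $6/month.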

The committee’s strong preference is to use the existing $1.3M in funding allocation to pay the monthly fees for academic titles and GSIs.  It appears that $1.3M is roughly equivalent to the amount needed to cover those groups, and it looks to me like that’s the direction the campus is going to take.

The overall impact to L&S will be significant in any case.  According to draft figures shared with us, L&S departments in total are currently spending approximately $14K/month for network charges; under the new model (with no funding allocations included), L&S costs rise to $104K/month–an increase of $90K/month, or $1.08M/year.  If L&S garners half of the $1.3M funding allocation (a reasonable guess based on faculty FTE counts), that still leaves us with additional costs of roughly $430K/year ($1.08M less $650K).
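
The arithmetic behind the L&S totals can be laid out in a few lines of Python.  The dollar figures are the rounded draft numbers quoted above, and the 50% share of the allocation is only my guess, so treat the result as a rough estimate.

```python
current_monthly = 14_000      # current L&S network charges, $/month
projected_monthly = 104_000   # projected L&S charges under the new model, $/month
allocation_total = 1_300_000  # existing campus funding allocation, $/year
ls_share = 0.5                # assumed L&S share of the allocation (a guess)

monthly_increase = projected_monthly - current_monthly
annual_increase = monthly_increase * 12
net_annual_increase = annual_increase - allocation_total * ls_share

print(f"increase:            ${monthly_increase:,}/month")
print(f"increase:            ${annual_increase:,}/year")
print(f"net of allocation:   ${net_annual_increase:,.0f}/year")
```

With these inputs the net increase works out to about $430K/year.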

They gave us projected cost breakdowns by department; here are some examples of monthly costs from a large and a small department in each  division (again, this is before any funding is allocated):

  • MCB: Current $969, projected $18,084, increase $17,115/month
  • IB: Current $1,230, projected $7,501, increase $6,271/month
  • Physics: Current $1,486, projected $7,947, increase $6,461/month
  • Statistics: Current $192, projected $1,656, increase $1,464/month
  • History: Current $410, projected $4,128, increase $3,718/month
  • Geography: Current $96, projected $912, increase $816/month
  • English: Current $297, projected $3,577, increase $3,280/month
  • Scandinavian: Current $56, projected $443, increase $387/month
  • UGIS: Current $306, projected $2,495, increase $2,189/month
  • L&S Advising: Current $198, projected $1,644, increase $1,447/month
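
For anyone who wants to check or extend these figures, here is a small Python sketch that recomputes the increase for each department from its current and projected charges.  It also estimates the FSE count implied by each projection, on the assumption (mine, not confirmed by the committee) that the projections were calculated at the $38.08/month rate.

```python
RATE_PER_FSE = 38.08  # $/month/FSE; assumed to be the rate behind the projections

# (department, current $/month, projected $/month), copied from the list above
DEPARTMENTS = [
    ("MCB", 969, 18_084),
    ("IB", 1_230, 7_501),
    ("Physics", 1_486, 7_947),
    ("Statistics", 192, 1_656),
    ("History", 410, 4_128),
    ("Geography", 96, 912),
    ("English", 297, 3_577),
    ("Scandinavian", 56, 443),
    ("UGIS", 306, 2_495),
    ("L&S Advising", 198, 1_644),
]


def summarize(current, projected, rate=RATE_PER_FSE):
    """Return (monthly increase, cost multiple, implied FSE count)."""
    increase = projected - current
    multiple = projected / current
    implied_fse = projected / rate
    return increase, multiple, implied_fse


for name, current, projected in DEPARTMENTS:
    increase, multiple, implied_fse = summarize(current, projected)
    print(f"{name:13s} +${increase:>6,}/month  {multiple:4.1f}x  ~{implied_fse:5.1f} FSE")
```

For MCB the implied count comes out at about 475 FSE, which is consistent with the FSE figure the model gives for that department.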

It’s obvious that impacts at these levels would be devastating to departments.  Funding allocation should bring the direct costs down quite a bit; my guess is that the proposal would cut 50-75% off the above numbers for most academic departments, depending on the makeup of departmental FTE.  Departments whose FTE are mostly in academic titles and GSIs would benefit the most from the proposed funding allocations; departments with many administrative staff and GSR positions would benefit less.  Fully administrative departments (such as mine) will likely be paying the full costs.  Even with funding allocations, many departments will struggle to find funding to pay these charges.

Thursday’s meeting was the first time the committee discussed the logistics of implementing this model.  I brought up the example of MCB, our largest department.  MCB has 475 FSE under the model, accounting for possibly 1000 people; the department would probably want to allocate network charges to 100 or more chart strings.  Imagine a scenario where the department manager gets a list of 800 people and has to manually specify the chart string (or multiple chart strings) for each one of them; it sounds ugly.  One possibility for simplifying the logistics would be to implement a GAEL-style payroll charge; the network would be charged to the same chart string as the individual’s payroll.  For some departments this would work well; for others it would still be a major headache.  We’re very early in the process of deciding how the model will be implemented; please let me know any comments you have, and I will continue to keep you informed.
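
To make the payroll-based idea a bit more concrete, here is a rough Python sketch of how charges might be rolled up automatically.  Everything about the data layout is hypothetical: the record format, the chart string labels, and the 0.5 FSE weight are invented for illustration, and nothing here reflects how the campus would actually implement it.  The only figure taken from the draft model is the $38.08 rate.

```python
from collections import defaultdict

RATE_PER_FSE = 38.08  # $/month/FSE (current draft figure)


def charges_by_chart_string(people, rate=RATE_PER_FSE):
    """Sum monthly network charges per chart string.

    `people` is a list of (fse_weight, payroll_split) pairs, where
    payroll_split maps chart string -> fraction of that person's payroll.
    The record layout and chart string labels are hypothetical.
    """
    totals = defaultdict(float)
    for fse_weight, payroll_split in people:
        for chart_string, fraction in payroll_split.items():
            # Each person's charge follows their payroll distribution.
            totals[chart_string] += fse_weight * rate * fraction
    return {cs: round(amount, 2) for cs, amount in totals.items()}


# Three made-up people: one on a single fund, one split across two funds,
# and one counted at a (hypothetical) 0.5 FSE weight.
people = [
    (1.0, {"CS-A": 1.0}),
    (1.0, {"CS-A": 0.5, "CS-B": 0.5}),
    (0.5, {"CS-B": 1.0}),
]
print(charges_by_chart_string(people))
```

The appeal of this approach is that nobody has to assign chart strings by hand; the drawback is that it only works where the payroll distribution is also the right place to charge the network.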

Posts and comments on this blog are the opinions of their authors, and do not necessarily represent the opinions of LSCR, the College of Letters & Science, or the University.