Glossary: Tags

.Net
The Microsoft .NET Framework is a software component that can be added to or is included with the Microsoft Windows operating system. It provides a large body of pre-coded solutions to common program requirements, and manages the execution of programs written specifically for the framework. The .NET Framework is a key Microsoft offering, and is intended to be used by most new applications created for the Windows platform. http://en.wikipedia.org/wiki/.NET_Framework
Activ8
activeMQ
activerecord
admin
administration
administrator
advertise
AJAX
Ajax, or AJAX, is a web development technique used for creating interactive web applications. The intent is to make web pages feel more responsive by exchanging small amounts of data with the server behind the scenes, so that the entire web page does not have to be reloaded each time the user requests a change. This is intended to increase the web page's interactivity, speed, functionality, and usability. http://en.wikipedia.org/wiki/Ajax_(programming)
algorithm
amazon
Amazon S3
amazon s3 image hosting
analytics large database real time
announcement
answers
Apache
The Apache HTTP Server Project is a collaborative software development effort aimed at creating a robust, commercial-grade, featureful, and freely-available source code implementation of an HTTP (Web) server. The project is jointly managed by a group of volunteers located around the world, using the Internet and the Web to communicate, plan, and develop the server and its related documentation. Apache is the most popular web server in use today because it is free, runs everywhere, performs well, and can be configured to handle most needs. http://httpd.apache.org/
apc
API
AppEngine
application
architecture partitioning scaling
ask
asp
asp.net
async
Fire-and-forget information exchange. Participants in an asynchronous messaging system don't have to wait for a response from the recipient, because they can rely on the messaging infrastructure to ensure delivery. This is a vital ingredient in loosely coupled systems such as web services, because it allows participants to communicate reliably even if one of the parties is temporarily offline, busy, or unobtainable. Asynchronous messaging systems are also vastly more scalable than those that rely on direct connections, such as remote procedure calls (RPCs). http://looselycoupled.com/glossary/asynchronous%20messaging
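As a rough illustration of the fire-and-forget idea, here is a minimal single-process sketch in Python. It assumes an in-memory `queue.Queue` standing in for the messaging infrastructure; the names `mailbox`, `send`, and `recipient` are hypothetical and do not come from any particular messaging product.

```python
import queue
import threading

mailbox = queue.Queue()  # stands in for the messaging infrastructure (assumption)
received = []

def send(message):
    """Fire and forget: enqueue the message and return immediately."""
    mailbox.put(message)

def recipient():
    """Drain the mailbox until the None sentinel arrives."""
    while True:
        message = mailbox.get()
        if message is None:
            break
        received.append(message)

# The sender runs while the recipient is still "offline".
for i in range(3):
    send(f"order-{i}")

# The recipient comes online later and still gets every message, in order.
worker = threading.Thread(target=recipient)
worker.start()
mailbox.put(None)  # sentinel telling the recipient to stop
worker.join()
```

The sender never waits on the recipient; it depends only on the queue accepting the message.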
Asynchronous
asynchronous architecture
automation
automattic
AWS
AWStats
A log analyzer program. http://highscalability.com/awstats
babes
Backup
Bandwidth
BCP
Business Continuity Planning: creating a plan for how an organization will recover and restore partially or completely interrupted critical functions within a predetermined time after a disaster or extended disruption. For web sites a big part of this is figuring out how to run in multiple data centers.
berkeleydb btree
best platform
BigTable
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this paper we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable. http://labs.google.com/papers/bigtable.html
bikers
bikes
Blog
blue coat
Book
A book related in some way to building highly scalable websites.
booze.
C
C++
Caching
Store result of a computation or I/O for quicker future access. Can cache locally, in remote memory, in the database, etc.
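A minimal sketch of local caching in Python, assuming a hypothetical `slow_lookup` function that stands in for the expensive computation or I/O, and a plain in-process dictionary as the cache:

```python
import time

_cache = {}  # in-process cache: key -> previously computed result
calls = 0    # counts how often the slow path actually runs

def slow_lookup(key):
    """Hypothetical stand-in for an expensive computation or I/O call."""
    global calls
    calls += 1
    time.sleep(0.01)  # simulate latency
    return key.upper()

def cached_lookup(key):
    """Return the cached result if present; otherwise compute and store it."""
    if key not in _cache:
        _cache[key] = slow_lookup(key)
    return _cache[key]

first = cached_lookup("user:42")   # slow path, result is stored
second = cached_lookup("user:42")  # served from the cache
```

The same pattern applies when the dictionary is replaced by remote memory or a database table; only the storage behind `_cache` changes.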
Cacti
Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices. http://www.cacti.net/
CakePHP
capacity
Growth in usage means needing more capacity. Scalability (horizontal or vertical) is the ability to easily add capacity to accommodate growth. Capacity doesn’t mean speed. Planning includes knowing what you have right now and predicting what you’ll need later: what, why, and when.
Capistrano
CARP
The Common Address Redundancy Protocol or CARP is a protocol which allows multiple hosts on the same local network to share a set of IP addresses. Its primary purpose is to provide failover redundancy. For example, if there is a single computer running a packet filter, and it goes down, then either the networks on either side of the packet filter can no longer communicate with each other, or they communicate without any packet filtering. If, however, there are two computers running a packet filter, running CARP, then if one fails, the other will take over, and computers on either side of the packet filter will not be aware of the failure, so operation will continue as normal. In order to make sure the new master operates the same as the old one, pfsyncd is used. In some configurations CARP can also provide load balancing functionality. http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol http://www.zampanosbits.com/carpdns.html - Redundant/Balanced Proxy Cache with CARP/DNS/Squid
Carrier Hotel
A carrier hotel, also called a colocation center, is a secure physical site or building where data communications media converge and are interconnected. It is common for numerous service providers to share the facilities of a single carrier hotel. This minimizes overhead and optimizes communications efficiency for all participants as long as the infrastructure is sufficient to handle all the data at times of peak demand. A carrier hotel is a sizable facility, often containing more than 5000 square meters (approximately 54,000 square feet) of floor space. Businesses that benefit from the use of carrier hotels include Web site hosting companies, storage service providers and telecommunications companies. http://searchstorage.techtarget.com/sDefinition/0,290660,sid5_gci1229533,00.html
CDN
CDN stands for content delivery network. A CDN is a system of computers networked together across the Internet that cooperate transparently to deliver content (especially large media content) to end users. The first web content based CDNs were Sandpiper and Skycache, followed by Akamai and Digital Island. The first video based CDN was iBEAM Broadcasting. CDN nodes are deployed in multiple locations, often over multiple backbones. These nodes cooperate with each other to satisfy requests for content by end users, transparently moving content behind the scenes to optimize the delivery process. Optimization can take the form of reducing bandwidth costs, improving end-user performance, or both. The number of nodes and servers making up a CDN varies depending on the architecture, with some reaching thousands of nodes and tens of thousands of servers. http://en.wikipedia.org/wiki/Content_Delivery_Network
cell phone
CentOS
China CDN Asia scaling cache
cloud
cloud computing
Clusering
cluster
Cluster File System
A cluster file system presents a single file system built from a distributed set of disks and creates multiple I/O paths by striping across the network. Incredible speed, bandwidth, manageability, and high availability are possible using clustered file systems.
Clustered Storage System
Clustering
clusters
CMS
Colocation
A colocation centre (collocation center) ("colo") or carrier hotel is a type of data center where multiple customers locate network, server and storage gear and interconnect to a variety of telecommunications and other network service provider(s) with a minimum of cost and complexity. http://en.wikipedia.org/wiki/Colocation
Comet
comet ajax Asynchronous non blocking HTTP Java Servlet
consistent hashing
cookie-based session storage
CPM
Cost per thousand impressions. The "M" means thousand, not million. The amount an advertiser will pay for each 1000 ads shown. Some ad campaigns are sold by CPM, while others are priced by click or another measure. Effective CPM is a useful measure of a campaign's profitability. The total price paid in a CPM deal is calculated by multiplying the CPM rate by the number of CPM units. For example, one million impressions at $10 CPM equals a $10,000 total price: 1,000,000 / 1,000 = 1,000 units, and 1,000 units x $10 CPM = $10,000 total price. The amount paid per impression is calculated by dividing the CPM by 1000. For example, a $10 CPM equals $.01 per impression: $10 CPM / 1000 impressions = $.01 per impression. http://www.marketingterms.com/dictionary/cpm/
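The CPM arithmetic above can be written as a short Python sketch. The function names `cpm_total_price` and `price_per_impression` are illustrative assumptions, not part of any advertising API.

```python
def cpm_total_price(impressions, cpm_rate):
    """Total price = number of CPM units (impressions / 1000) times the CPM rate."""
    units = impressions / 1000
    return units * cpm_rate

def price_per_impression(cpm_rate):
    """Price of a single impression = CPM rate / 1000."""
    return cpm_rate / 1000

# The figures from the definition: one million impressions at a $10 CPM.
total = cpm_total_price(1_000_000, 10)
per_impression = price_per_impression(10)
```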
CSP
Communicating Sequential Processes (CSP) is a concurrency model invented by Tony Hoare. He wrote a good book about it by the same name, which is now available online from http://www.usingcsp.com/. CSP is a language for describing patterns of interaction. It is supported by an elegant, mathematical theory, a set of proof tools, and an extensive literature. The book Communicating Sequential Processes was first published in 1985 by Prentice Hall International (who have kindly released the copyright); it is an excellent introduction to the language, and also to the mathematical theory. http://www.possibility.com/epowiki/Wiki.jsp?page=CommunicatingSequentialProcesses
CTR
Click through rate. Click-through rate or CTR is a way of measuring the success of an online advertising campaign. A CTR is obtained by dividing the number of users who clicked on an ad on a web page by the number of times the ad was delivered (impressions). For example, if your banner ad was delivered 100 times (impressions delivered) and 1 person clicked on it (clicks recorded), then the resulting CTR would be 1%. http://en.wikipedia.org/wiki/Click-through_rate
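A minimal Python sketch of the CTR calculation described above; the function name `click_through_rate` is an assumption chosen for illustration.

```python
def click_through_rate(clicks, impressions):
    """CTR as a percentage: clicks divided by impressions, times 100."""
    if impressions <= 0:
        raise ValueError("impressions must be greater than zero")
    return 100 * clicks / impressions

# The example from the definition: 1 click on 100 delivered impressions.
ctr = click_through_rate(1, 100)
```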
CV
CVSup
CVSup is a software package for distributing and updating collections of files across a network. It can efficiently and accurately mirror all types of files, including sources, binaries, hard links, symbolic links, and even device nodes. CVSup's streaming communication protocol and multithreaded architecture make it most likely the fastest mirroring tool in existence today. http://www.cvsup.org/
Data Center
A facility used to house mission critical computer systems and associated components. http://www.possibility.com/epowiki/Wiki.jsp?page=DatacenterSystemChoiceAnalysis
Data Parallel Algorithms
Parallel computers with tens of thousands of processors are typically programmed in a data parallel style, as opposed to the control parallel style used in multiprocessing. The success of data parallel algorithms, even on problems that at first glance seem inherently serial, suggests that this style of programming has much wider applicability than was previously thought. http://www.possibility.com/epowiki/Wiki.jsp?page=DataParallelAlgorithms
data-grid
Database
In computing, a database can be defined as a structured collection of records or data that is stored in a computer so that a program can consult it to answer queries. The records retrieved in answer to queries become information that can be used to make decisions. The computer program used to manage and query a database is known as a database management system (DBMS). The properties and design of database systems are included in the study of information science.
database replication
databases
DBI
The DBI is the standard database interface module for Perl. It defines a set of methods, variables and conventions that provide a consistent database interface independent of the actual database being used. http://dbi.perl.org/
debian
deployment
Deployment is installing, configuring, and managing your website. Deployment includes making sure you can deploy new capacity easily.
design
DHT
Diagonal Scaling
digramatic representation of scalable web
distributed algorithm
distributed file system
distributed systems
Django
dns
DRBD
DRBD is a block device designed for building high availability clusters. This is done by mirroring a whole block device over a (dedicated) network. You could see it as a network RAID-1. http://www.drbd.org/
drupal
eAccelerator
eAccelerator is a free open-source PHP accelerator, optimizer, and dynamic content cache. It increases the performance of PHP scripts by caching them in their compiled state, so that the overhead of compiling is almost completely eliminated. http://highscalability.com/eAccelerator
EC2
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers. http://aws.amazon.com/
education
ejb
ejb2
ejb3
email
Junk mail with occasionally useful content.
enterprise java beans
epoll
epoll is a variant of poll(2) that can be used either as Edge or Level Triggered interface and scales well to large numbers of watched fds. Three system calls are provided to set up and control an epoll set: epoll_create(2), epoll_ctl(2), epoll_wait(2). http://linux.die.net/man/4/epoll
erlang
Event
An event is an occurrence or happening of significance to a task or program, such as the completion of an asynchronous input/output operation. A task may wait for an event or any of a set of events or it may (request to) receive asynchronous notification (a signal or interrupt) that the event has occurred. Event-driven programming or event-based programming is a computer programming paradigm in which the flow of the program is determined by user actions (mouse clicks, key presses) or messages from other programs. In contrast, in batch programming or flow-driven programming the flow is determined by the programmer. Batch programming is the style taught in beginning programming classes while event-driven programming is what is needed in any interactive program. Event-driven programs can be written in any language, although the task is easier in some languages than in others. http://en.wikipedia.org/wiki/Event-driven_programming
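To make the contrast with flow-driven code concrete, here is a minimal event-dispatch sketch in Python. The `on` and `dispatch` helpers and the event names are hypothetical; real frameworks provide their own event loops and registration APIs.

```python
handlers = {}  # event name -> list of callbacks registered for it

def on(event_name, handler):
    """Register a callback to run whenever event_name occurs."""
    handlers.setdefault(event_name, []).append(handler)

def dispatch(event_name, payload):
    """Deliver an event to every callback registered for it."""
    for handler in handlers.get(event_name, []):
        handler(payload)

log = []
on("click", lambda pos: log.append(("clicked", pos)))
on("keypress", lambda key: log.append(("typed", key)))

# The order of incoming events, not the program text, decides what runs next.
incoming = [("keypress", "a"), ("click", (10, 20)), ("keypress", "b")]
for name, payload in incoming:
    dispatch(name, payload)
```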
Example
An example website architecture that we can learn from.
F5 WA
facebook
FAI
failure
Failure Analysis
Examples or reasons why websites fail.
fast
Fault-tolerance
Fault-tolerance or graceful degradation is the property that enables a system (often computer-based) to continue operating properly in the event of the failure of some of its components. If its operating quality decreases at all, the decrease is proportional to the severity of the failure, as compared to a naively-designed system in which even a small failure can cause total breakdown. Fault-tolerance is particularly sought-after in high-availability or life-critical systems. http://en.wikipedia.org/wiki/Fault-tolerance
federated database
file server
file system
file upload
flickr
framework
freebsd
Friendster
FTP Redundancy Archive Consolidation
funny
fuse
future
Future Event
An interesting event like a web cast or meeting that will be happening in the future.
GAE
Ganglia
Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It uses multicast and/or unicast to inject XML data into an RRDtool frontend, and it makes custom graphs very easy to create because it was originally written to handle stats data from HPC clusters. http://ganglia.sourceforge.net/
Geo-distributed Clusters
Architecting your website to run in more than one data center.
GFS
GFS GNBD shared block device
gigaspaces
gnbd
google
google add urls
google guide
google info
Google Stack
The stack of software and hardware on which Google develops its applications. http://highscalability.com/google-architecture
google tips
google tricks
graceful
gravatar
Grid
Grid computing is a phrase in distributed computing which can have several meanings:
  * A local computer cluster which is like a "grid" because it is composed of multiple nodes.
  * Offering online computation or storage as a metered commercial service, known as utility computing, "computing on demand", or "cloud computing".
  * The creation of a "virtual supercomputer" by using spare computing resources within an organization.
  * The creation of a "virtual supercomputer" by using a network of geographically dispersed computers. Volunteer computing, which generally focuses on scientific, mathematical, and academic problems, is the most common application of this technology. http://en.wikipedia.org/wiki/Grid_computing
GWT
Google Web Toolkit: build AJAX apps in the Java language. For more information please see: http://www.possibility.com/epowiki/Wiki.jsp?page=GWT
HA
High availability. Availability, in its simplest sense, describes the proportion of time that a system is available for use. It is a system measurement, and therefore is indifferent to the source of faults and failures. Software and hardware faults, and even operational errors, can figure into a system's availability. Availability can be defined as MTTF / (MTTF + MTTR), where MTTF is mean time to failure and MTTR is mean time to repair. For more information please see: http://www.possibility.com/epowiki/Wiki.jsp?page=HighAvailability
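The availability formula can be worked through in a few lines of Python. The MTTF and MTTR figures below are made-up illustrative values, not measurements from any real system.

```python
def availability(mttf_hours, mttr_hours):
    """Availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Illustrative assumption: the system fails on average every 999 hours
# and takes 1 hour to repair.
a = availability(999.0, 1.0)

# Expected downtime over a 365-day year at that availability.
downtime_hours_per_year = (1 - a) * 365 * 24
```

With these assumed figures the system is available 99.9% of the time, which still allows several hours of downtime per year; shortening repair time raises availability just as lengthening time to failure does.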
Haddoop
haddop
Hadoop
Hadoop is a framework for running applications on large clusters of commodity hardware. Hadoop implements a computational paradigm named map/reduce, where the application is divided into many small fragments of work, each of which may be executed or reexecuted on any node in the cluster.

haproxy
heavy weight
hello
help
hibernate
high availability
high availablilty
high-scalability
Horizontal Scaling
Scaling out by incrementally adding more machines to handle more work. The new era of cheap yet powerful computers has made horizontal scaling possible for virtually anyone. Many companies can afford to keep a grid of hundreds of machines to solve problems. This is the approach Google has taken for its search systems, for example, and it is very different from a fixed resource approach. In a fixed resource approach we would be squeezing every cycle of performance out of the resources, spending a lot of time developing new approaches and tuning existing code to fit the exact problem. When resources are available, and your approach is right, you can just add more machines. You start to figure out ways to solve your problem assuming horizontal scaling.
hosting
hp
ibatis
ID generator
IIS
Microsoft Internet Information Services (IIS; formerly called Internet Information Server) is a set of Internet-based services for servers using Microsoft Windows. It is the world's second most popular web server in terms of overall websites. As of May 2007 it served 31% of all websites according to Netcraft. The servers currently include FTP, SMTP, NNTP and HTTP/HTTPS. http://en.wikipedia.org/wiki/Internet_Information_Services
Image Server
A service that provides storage for images and high bandwidth access to them. It differs from a CDN in that it may not be distributed across data centers.
image solution
imap
inidrmeden izle
installation
instant-messaging
Internet TV
isp
ispman
j2EE
Java
Java is a programming language originally developed by Sun Microsystems and released in 1995. Java applications are typically compiled to bytecode, although compilation to native machine code is also possible. At runtime, bytecode is usually either interpreted or compiled to native code for execution, although direct hardware execution of bytecode by a Java processor is also possible. Java is very popular on the server side because it is free, relatively high performing, has a large number of useful libraries, and has great development tools. Websites built using Java generally use application servers and are accessed using servlets. http://www.java.com/en/
Java server socket nio Scalable nonblocking
Java Servlet
JavaScript
JDBC
job queue
jobs
Joost
joost network architecture p2p tv
LAMP
LAMP is a popular open-source technology stack on which many websites are built. The letters stand for: Linux, the operating system; Apache, the web server; MySQL, the database management system (or database server); and PHP, the programming language. LAMP is popular because all the components are free, in the open-source sense. This means you can horizontally scale your system at a much lower incremental cost as demand increases. To a large extent LAMP is more an idea than a specific set of technologies. It's much like AJAX in that way. Replacing each letter with a different technology doesn't change the spirit of the acronym. To build a website you need an OS (Linux), you need a web server (Apache), you need a database (MySQL), and you need a programming language (PHP). Replace Linux with Windows and you lose some cost flexibility, but you can still build your website. Replace MySQL with PostgreSQL and you can still build your website. The advantage of the LAMP stack is that there is a lot of expertise and help available when using it, and all the parts have evolved to work better together. http://en.wikipedia.org/wiki/LAMP_(software_bundle)
LAMP PHP Linux MySQL Apache developer
language
Latency
Network latency has two common meanings: the time it takes for a packet to cross a network connection, from sender to receiver; and the period of time that a frame is held by a network device before it is forwarded. Two of the most important parameters of a communications channel are its latency, which should be low, and its bandwidth, which should be high. Latency is particularly important for a synchronous protocol where each packet must be acknowledged before the next can be transmitted. OS latency: let T be a task belonging to a time-sensitive application that requires execution at time t, and let t' be the time at which T is actually scheduled. The OS latency experienced by T is L = t' - t. http://www.possibility.com/epowiki/Wiki.jsp?page=ItsTheLatencyStupid
Layeredtech
A provider of next generation, self-managed utility computing and hosting solutions. They offer self-managed servers, colocation, and grid computing. http://www.layeredtech.com/
lbpool
Light Weight Web Server
A server that advocates consider more efficient than heavier weight servers. Now, isn't that helpful?
lighttpd
lighttpd (pronounced "lighty") is a web server which is designed to be secure, fast, standards-compliant, and flexible while being optimized for speed-critical environments. http://highscalability.com/lighttpd
lightweight
Linkedin
Linux
Linux is a free Unix-type operating system originally created by Linus Torvalds with the assistance of developers around the world. Linux is a very popular OS in data centers because it is free, runs on a lot of hardware, has tons of available software, performs well, is easy to virtualize, and is flexible. These are all good attributes when you are starting a web site and hoping to grow with demand. Some popular versions of Linux used in data centers are CentOS, Red Hat, and Ubuntu. http://www.linux.org/
LiveJournal
http://www.livejournal.com/ LiveJournal lets you express yourself, share your life, and connect with friends online. You can use LiveJournal in many different ways: as a private journal, a blog, a discussion forum, a social network, and more.
load
load balance
load balancer
Load Balancing
Load balancing is a scalability solution where work is allocated amongst multiple servers: DNS servers can allocate requests amongst multiple machines; web servers can allocate requests amongst machines and processes; and hardware routers can allocate requests amongst multiple machines. Google uses this strategy for its amazing performance. There is often a large infrastructure in place to replicate state so that requests in the same session can access state from any server. Mostly read-only applications make the best use of load balancing because write-update consistency problems are not present. Often L4-L7 switches, like the NetScaler, are used to load balance servers at line rate. Or you can load balance without the assistance of hardware using Hash Based Node Selection. For more information please see: http://www.possibility.com/epowiki/Wiki.jsp?page=LoadBalancing
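One plausible reading of hash-based node selection, sketched in Python: hash a stable request attribute and take it modulo the number of servers. The server names and the choice of the session id as the key are assumptions made for illustration; the linked page may describe a different scheme.

```python
import hashlib

def select_node(session_id, nodes):
    """Map a session id to one node by hashing it modulo the node count."""
    digest = hashlib.md5(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]

nodes = ["web1", "web2", "web3"]  # hypothetical server names
chosen = select_node("session-abc", nodes)
again = select_node("session-abc", nodes)  # same session, same node
```

Because the mapping is deterministic, requests from the same session land on the same server without shared state in the balancer. The trade-off of plain modulo hashing is that adding or removing a node remaps most keys, which is the problem consistent hashing is designed to reduce.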
load-balancing
Log Analyzer
Web log analysis software (also called a web log analyzer) is software that parses a log file from a web server (like Apache), and based on the values contained in the log file, derives indicators about who, when and how a web server is visited. http://en.wikipedia.org/wiki/Web_log_analysis_software
Log Structured File System
Conventional file systems lay out data to perform well on disks that seek slowly. The design of log-structured file systems is based on the hypothesis that this will no longer be effective, because ever-increasing memory sizes on modern computers mean reads are almost always satisfied from memory cache, leaving I/O write-heavy. A log-structured file system thus treats its storage as a circular log and writes sequentially to the head of the log. This maximizes write throughput on magnetic media by avoiding costly seeks. http://en.wikipedia.org/wiki/Log-structured_file_system
logging
look-mom-no-firewalls
low-latency
Latency is the amount of time a message takes to traverse a system. In a computer network, it is an expression of how much time it takes for a packet of data to get from one designated point to another. It is sometimes measured as the time required for a packet to be returned to its sender. Latency depends on the speed of the transmission medium (e.g., copper wire, optical fiber or radio waves) and the delays in the transmission by devices along the way (e.g., routers and modems). A low latency indicates a high network efficiency. http://www.bellevuelinux.org/latency.html
Lucene
Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. http://lucene.apache.org/java/docs/index.html
LVS
The Linux Virtual Server is a highly scalable and highly available server built on a cluster of real servers, with the load balancer running on the Linux operating system. The architecture of the server cluster is fully transparent to end users, and the users interact as if it were a single high-performance virtual server. The Linux Virtual Server as an advanced load balancing solution can be used to build highly scalable and highly available network services, such as scalable web, cache, mail, ftp, media and VoIP services. http://www.linuxvirtualserver.org/
management
Map Reduce
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. http://labs.google.com/papers/mapreduce.html
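The programming model can be illustrated with a single-process word count in Python. This is only a sketch of the map and reduce contract: the `run_mapreduce` driver is a hypothetical stand-in for the run-time system and does none of the partitioning, scheduling, or failure handling described above.

```python
from collections import defaultdict

def map_fn(_doc_name, text):
    """Map: emit an intermediate (word, 1) pair for every word in the text."""
    for word in text.split():
        yield word.lower(), 1

def reduce_fn(word, counts):
    """Reduce: merge all intermediate values that share the same key."""
    return word, sum(counts)

def run_mapreduce(inputs, map_fn, reduce_fn):
    """Toy sequential driver: group intermediate pairs by key, then reduce."""
    intermediate = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)
    return dict(reduce_fn(k, v) for k, v in intermediate.items())

documents = [("doc1", "the quick brown fox"), ("doc2", "the lazy dog")]
word_counts = run_mapreduce(documents, map_fn, reduce_fn)
```

Because `map_fn` and `reduce_fn` share no state, a real run-time is free to execute them on different machines and to rerun them after a failure.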
Marcel
Master
In single-master replication, the master server writes updates to its binary log files and maintains an index of those files to keep track of log rotation. The binary log files serve as a record of updates to be sent to any slave servers. When a slave connects to its master, it informs the master of the position up to which the slave read the logs at its last successful update. The slave receives any updates that have taken place since that time, and then blocks and waits for the master to notify it of new updates.
mcluster
Memcached
memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load. Danga Interactive developed memcached to enhance the speed of LiveJournal.com, a site which was already doing 20 million+ dynamic page views per day for 1 million users with a bunch of webservers and a bunch of database servers. memcached dropped the database load to almost nothing, yielding faster page load times for users, better resource utilization, and faster access to the databases on a memcache miss. Memcached is very popular and is used in many websites. http://www.danga.com/memcached/
Memcached syncronization
memoization
In computing, memoization is an optimization technique used primarily to speed up computer programs by storing the results of function calls for later reuse, rather than recomputing them at each invocation of the function. Memoization has also been used in other contexts (and for other purposes than speed gains), such as in parsing. Although related to caching, memoization refers to a specific case of this optimization, distinguishing it from forms of caching such as buffering or page replacement. http://en.wikipedia.org/wiki/Memoization
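A minimal memoization sketch in Python using the standard library's `functools.lru_cache`; the Fibonacci function is just a convenient example of a function whose naive recursion repeats the same calls many times.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache keyed on the function arguments
def fib(n):
    """Naive recursive Fibonacci; memoization makes it linear in n."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(50)
stats = fib.cache_info()  # hits and misses recorded by the cache
```

Without the decorator the same call would recompute the same subproblems an exponential number of times; with it, each distinct argument is computed once and later calls are answered from the stored results.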
Messaging
Message-oriented middleware comprises a category of inter-application communication software that generally relies on asynchronous message-passing as opposed to a request/response metaphor. Most message-oriented middleware (MOM) depends on a message queue system, although some implementations rely on broadcast or on multicast messaging systems. http://en.wikipedia.org/wiki/Message_Oriented_Middleware
Microsoft
mirror
mod_perl
mod_perl gives you a persistent Perl interpreter embedded in your web server. This lets you avoid the overhead of starting an external interpreter and avoids the penalty of Perl start-up time, giving you super-fast dynamic content. http://perl.apache.org/
mod_proxy
mod_proxy_balancer
MogileFS
MogileFS is an open source distributed filesystem. Its properties and features include: Application level, No single point of failure, Automatic file replication, Better than RAID, Flat Namespace, Shared-Nothing, No RAID required, Local filesystem agnostic. http://www.danga.com/mogilefs/
Mongrel
Mongrel is a fast HTTP library and server for Ruby that is intended for hosting Ruby web applications of any kind using plain HTTP rather than FastCGI or SCGI. It is framework agnostic and already supports Ruby On Rails, Og+Nitro, Camping, and IOWA frameworks. http://mongrel.rubyforge.org/
Monitoring
Checking the health of application, networks, machines, and any other critical resources.
mono
MTBF
MTBF is the mean-time-between-failures. That means that the system of drives will, on average, have a certain period of time between failures. That number can be far lower than MTTF. MTTF indicates life, but MTBF doesn’t. MTTF generally will not vary with time, but MTBF does. Also, MTTF doesn’t vary with rates of installation or replacement, yet MTBF will.
MTTF
MTTF is the mean-time-to-failure. That means that each drive will, on average, last a certain amount of time. For example, with an MTTF of 1,000,000 hours, each drive will last, on average, 1,000,000 hours; some will die sooner, some later.
multi
multilanguage
multisourcing
Munin
Munin the monitoring tool surveys all your computers and remembers what it saw. http://highscalability.com/product-munin-monitoriting-tool
MySQL
The world's most popular open source database. A common database choice for websites. http://www.mysql.com/
mysql cluster
mysql memcached java ha brdb replication clustering proxy
MySQL Proxy
mysql search caching architecture
mysql-proxy
MySQL. Shard
Nagios
Nagios is an open source host, service, and network monitoring program. It is used by sites like FeedBurner and Yahoo to monitor their data centers. It is very flexible, but takes a lot of work to customize to your needs. http://www.nagios.org/
ndb
NetCache
Network Architecture
network engineering
network switch
network topology
NFS failover high availability ucarp
nginx
nodes
non blocking HTTP
Olympic Site Architecture
on demand
open source
OpenSocial
openspaces
operations
Oprah
Oracle
A powerful high end RDBMS. http://www.oracle.com/
orm
OSS
outlaws
overflow
P2P
A peer-to-peer (or "P2P") computer network exploits diverse connectivity between participants in a network and the cumulative bandwidth of network participants, rather than conventional centralized resources where a relatively low number of servers provide the core value to a service or application. http://en.wikipedia.org/wiki/Peer-to-peer
Paper
A category to link all useful papers together.
partial string matching
perdition
Either the abode of Satan or a fully featured POP3 and IMAP4 proxy server. http://highscalability.com/product-perdition-mail-retrieval-proxy
Performance
A category for performance related tools and topics.
performance monitor
Perl
Perl is a dynamic programming language created by Larry Wall and first released in 1987. Perl borrows features from a variety of other languages including C, shell scripting (sh), AWK, sed and Lisp. Its powerful text-processing features and amazing library support made it an early favorite language choice for website designers. Other languages like PHP, Java, and .Net have since become more popular, but it is still a favorite of many. http://www.perl.com/
Perlbal
Perlbal is our Perl-based reverse proxy load balancer and web server. It processes hundreds of millions of requests a day just for LiveJournal, Vox and TypePad and dozens of other "Web 2.0" applications. Perlbal is a single-threaded event-based server supporting HTTP load balancing, web serving, and a mix of the two. http://www.danga.com/perlbal/
PHP
PHP is a reflective programming language originally designed for producing dynamic web pages. PHP is used mainly in server-side scripting, but can be used from a command line interface or in standalone graphical applications. PHP is popular because it's free, relatively easy to program in, and has a lot of features for producing websites quickly. http://www.php.net/
Pingdom
PL/Proxy
PL/SQL
platform
podcast
poll
pop
pop3
Postgres
PostgreSQL is a powerful, open source relational database system. It has more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness. It runs on all major operating systems, including Linux, UNIX (AIX, BSD, HP-UX, SGI IRIX, Mac OS X, Solaris, Tru64), and Windows. It is fully ACID compliant, has full support for foreign keys, joins, views, triggers, and stored procedures (in multiple languages). It includes most SQL92 and SQL99 data types, including INTEGER, NUMERIC, BOOLEAN, CHAR, VARCHAR, DATE, INTERVAL, and TIMESTAMP. It also supports storage of binary large objects, including pictures, sounds, or video. It has native programming interfaces for C/C++, Java, .Net, Perl, Python, Ruby, Tcl, ODBC, among others, and exceptional documentation. http://www.postgresql.org/
postgresql
pound
PowerDNS
The PowerDNS Nameserver is a modern, advanced and high performance authoritative-only nameserver. It is written from scratch and conforms to all relevant DNS standards documents. Furthermore, PowerDNS interfaces with almost any database. http://www.powerdns.com/
Problems
A category describing different scalability problems web builders may encounter.
Product
profiling
propel
proxy
push
Python
Qcon
questions
RAID
In computing, specifically computer storage, a Redundant Array of Independent Drives (or Disks), also known as Redundant Array of Inexpensive Drives (or Disks), (RAID) is an umbrella term for data storage schemes that divide and/or replicate data among multiple hard drives. RAID can be designed to provide increased data reliability or increased I/O performance, though one goal may compromise the other. http://en.wikipedia.org/wiki/RAID
rails
replication
Reverse Proxy
In the classical set-up, a web proxy caches the contents of an unlimited number of web servers for a limited number of clients. Another set-up is "reverse proxy" or "webserver acceleration" (using httpd_accel_host). In this set-up, the cache serves an unlimited number of clients for a limited number of web servers, or just one. As an example, if slow.example.com is a "real" web server, and www.example.com is the Squid cache server that "accelerates" it, the first time any page is requested from www.example.com, the cache server would get the actual page from slow.example.com, but later requests would get the stored copy directly from the accelerator (for a configurable period, after which the stored copy would be discarded). The end result, without any action by the clients, is less traffic to the source server, meaning less CPU and memory usage, and less need for bandwidth. This does, however, mean that the source server cannot accurately report on its traffic numbers. http://en.wikipedia.org/wiki/Squid_cache
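The accelerator behaviour described above (fetch once from the origin, serve the stored copy for a configurable period, then discard it) can be sketched in a few lines. This is a toy model, not how Squid is implemented; `CachingReverseProxy`, `slow_origin`, and the 60-second lifetime are hypothetical names and values chosen for illustration.

```python
import time


class CachingReverseProxy:
    """Toy accelerator: caches origin responses for ttl_seconds."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be simulated
        self._cache = {}             # path -> (expires_at, body)

    def get(self, path):
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]          # stored copy is still fresh
        body = self._fetch(path)     # miss or expired: go to the origin
        self._cache[path] = (now + self._ttl, body)
        return body


origin_hits = []


def slow_origin(path):
    """Stand-in for slow.example.com; records every request it receives."""
    origin_hits.append(path)
    return f"<html>{path}</html>"


fake_now = [0.0]
proxy = CachingReverseProxy(slow_origin, ttl_seconds=60, clock=lambda: fake_now[0])
first = proxy.get("/index.html")     # fetched from the origin
second = proxy.get("/index.html")    # served from the cache
fake_now[0] = 61.0                   # simulate the stored copy expiring
third = proxy.get("/index.html")     # fetched from the origin again
```

Three client requests reach the origin only twice, which is also why the origin's own traffic numbers undercount real usage.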
RoR
Rails is a full-stack framework for developing database-backed web applications according to the Model-View-Controller pattern. From the Ajax in the view, to the request and response in the controller, to the domain model wrapping the database, Rails gives you a pure-Ruby development environment. To go live, all you need to add is a database and a web server. http://rubyonrails.org/
S2
S3
Amazon S3 is storage for the Internet. It is designed to make web-scale computing easier for developers. http://aws.amazon.com/
SAAS
Software as a service (SaaS) is a software application delivery model where a software vendor develops a web-native software application and hosts and operates (either independently or through a third-party) the application for use by its customers over the Internet. Customers pay not for owning the software itself but for using it. They use it through an API accessible over the Web and often written using Web Services or REST. The term SaaS has become the industry preferred term, generally replacing the earlier terms Application Service Provider (ASP), On-Demand and "Utility computing". http://en.wikipedia.org/wiki/Software_as_a_Service
sampling
SAN
A storage area network (SAN) is an architecture to attach remote computer storage devices such as disk arrays, tape libraries and optical jukeboxes to servers in such a way that, to the operating system, the devices appear as locally attached devices. By contrast to a SAN, network-attached storage (NAS) uses file-based protocols such as NFS or SMB/CIFS where it is clear that the storage is remote, and computers request a portion of an abstract file rather than a disk block. http://en.wikipedia.org/wiki/Storage_area_network
Scalability
Scalability is the ability to keep solving a problem as the size of the problem increases. Scale is measured relative to your requirements. As long as you can scale enough to solve your problem then you have scale. If you can handle the number of objects and events required for your application then you can scale. It doesn't really matter what the numbers are. Scaling often creates a difference in kind for potential solutions. The solution you need to handle a small problem is not the same as you need to handle a large problem. If you incrementally try to evolve one into the other you can be in for a rude surprise, because it won't work as you pass through different points of discontinuity. Scale is not language or framework specific. It is a matter of approach and design. http://www.possibility.com/epowiki/Wiki.jsp?page=Scalability
Scalability User Group
scalable
scale-out
scaling
scaling storage
scheduler
scripting languages
search
search scalability
session management
sessions
Shard
A shard architecture partitions data onto multiple servers so each server holds a shard of the data. It's a federated model. You can partition data by user, by type (photos, messages, etc.), or a combination. Some advantages are: faster backup; faster recovery; data can fit into memory; data is easier to manage; and more write bandwidth, because you aren't writing to a single master. In a single-master architecture, write bandwidth is throttled. This technique is used by many large websites, including eBay, Yahoo, LiveJournal, and Flickr.
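One simple way to partition by user is to hash the user ID and take the result modulo the number of shards. The sketch below is only one possible scheme, and the host names in `SHARDS` and the `shard_for_user` function are hypothetical; many sites use a lookup directory instead so that users can be moved between shards.

```python
import zlib

# Hypothetical shard hosts; each would hold a disjoint subset of the users.
SHARDS = ["db0.example.internal", "db1.example.internal", "db2.example.internal"]


def shard_for_user(user_id, shards=SHARDS):
    """Map a user ID to one shard, deterministically.

    crc32 is used because it is stable across processes and runs,
    unlike Python's built-in hash() for strings.
    """
    key = str(user_id).encode("utf-8")
    return shards[zlib.crc32(key) % len(shards)]


# Every read and write for a given user goes to the same server.
home = shard_for_user("alice")
```

A known limitation of plain modulo hashing is that changing the number of shards remaps most users, so data has to be moved when servers are added.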
sharding
shards
SimpleDB
singshot architecture
SLA
An SLA is a formal negotiated agreement between two parties. It is a contract that exists between customers and their service provider, or between service providers. It records the common understanding about services, priorities, responsibilities, guarantee, etc. with the main purpose to agree on the level of service. For example, it may specify the levels of availability, serviceability, performance, operation or other attributes of the service like billing and even penalties in the case of violation of the SLA. http://en.wikipedia.org/wiki/Service_Level_Agreement
Slave
A database server synced to a master server. Used to spread out read requests among multiple slaves.
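Spreading reads across slaves while sending writes to the master can be sketched as a small router. This is a minimal illustration that treats any statement starting with SELECT as a read and rotates through the slaves round-robin; `ReplicatedRouter` and the server names are hypothetical, and a real deployment would also have to account for replication lag.

```python
import itertools


class ReplicatedRouter:
    """Send reads to slaves in round-robin order and writes to the master."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def route(self, sql):
        # Simplifying assumption: only statements starting with SELECT are reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._slaves) if is_read else self.master


router = ReplicatedRouter("master", ["slave-a", "slave-b"])
targets = [
    router.route(query)
    for query in (
        "SELECT * FROM users",
        "UPDATE users SET name = 'x'",
        "SELECT 1",
        "select 2",
    )
]
```

Here the three reads alternate between the two slaves and the single write goes to the master.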
smtp
social networking
solaris
sourcecode
space-based architecture
space-based-architecture
SPOF
Single Point of Failure. Achieving high reliability requires removing all single points of failure, usually through redundancy and failover schemes. http://en.wikipedia.org/wiki/Single_point_of_failure
Spring
SQL Server
Microsoft SQL Server is a relational database management system (RDBMS) produced by Microsoft. Its primary query language is Transact-SQL, an implementation of the ANSI/ISO standard Structured Query Language (SQL) used by both Microsoft and Sybase. http://en.wikipedia.org/wiki/Microsoft_SQL_Server
sqlrelay
SQS
Amazon Simple Queue Service (Amazon SQS) offers a reliable, highly scalable hosted queue for storing messages as they travel between computers. By using Amazon SQS, developers can simply move data between distributed application components performing different tasks, without losing messages or requiring each component to be always available. http://aws.amazon.com/
Squid
Squid is a proxy server, reverse proxy server, and web cache daemon. It has a wide variety of uses, from speeding up a web server by caching repeated requests, to caching web, DNS and other computer network lookups for a group of people sharing network resources, to aiding security by filtering traffic. Squid is used by many websites as part of their scaling architecture. http://en.wikipedia.org/wiki/Squid_cache
SSD
SSDS
SSH
Starling
startup
Statistics
storage
Storage Virtualization
Speed and scalability in the storage system can be had using clustered file systems. These create a single file system from a distributed set of disks and create multiple IO paths by striping across the network. Storage virtualization refers to the process of abstracting logical storage from physical storage. The term is today used to describe this abstraction at any layer in the storage software and hardware stack. It helps get around one of the major problems of scaling a website, which is cleanly adding more disk space. http://en.wikipedia.org/wiki/Storage_virtualization
story
Strategy
A strategy is something you can do, that is often quite simple, to help improve your website.
streaming
Subcon
Subcon allows you to store your essential system configuration files in a subversion repository and easily deploy different configurations to machines in a cluster. It also features optional integration with SystemImager, enabling the deployment of system images and configuration in a single step. A flexible configuration file provides the ability to start, stop, or restart services or run arbitrary scripts when a change in a file or set of files is detected. http://code.google.com/p/subcon/
subversion
Sun
Sun has one of the best OSes on the market and a large selection of servers, storage, and other products that are popular in data centers. http://www.sun.com/
Symfony
symfony is an open-source PHP web framework. http://highscalability.com/symfony
sync
SystemImager
SystemImager is software which automates Linux installs, software distribution, and production deployment. SystemImager is a part of System Installation Suite. SystemImager makes it easy to do automated installs (clones), software distribution, content or data distribution, configuration changes, and operating system updates to your network of Linux machines. You can even update from one Linux release version to another! It can also be used to ensure safe production deployments. By saving your current production image before updating to your new production image, you have a highly reliable contingency mechanism. If the new production environment is found to be flawed, simply roll back to the last production image with a simple update command! Some typical environments include: Internet server farms, database server farms, high performance clusters, computer labs, and corporate desktop environments. http://systemimager.org
tag cloud
tags
tech
Television
template
test
TheSchwartz
An open source, scalable job processing system that both operations and engineers will love. From http://brad.livejournal.com/2250519.html: TheSchwartz is fire and forget. Higher latency (seconds or more), but it'll get done, according to the rules you put down for it. (regarding backoff policies and retries...) like sending an email. who cares if it takes 5 seconds, but *** if you're gonna sit around and wait for it. you just want to be confident it'll work. and you're given a handle so you can check on its return code (or current errors or error history) later, if policy for the job you've submitted is set to retain that exit status. http://code.sixapart.com/trac/TheSchwartz
thread safe
tools and Utilities
traffic
undefined
unicluster
utility computing
varnish
Vertical Scaling
Scaling by adding more resources, such as CPUs, within the same computer system.
video
virtual LAN
virtualisation
Virtualization
Virtualization is an abstraction layer that decouples the physical hardware from the operating system to deliver greater IT resource utilization and flexibility. Virtualization allows multiple virtual machines, with heterogeneous operating systems to run in isolation, side-by-side on the same physical machine. Each virtual machine has its own set of virtual hardware (e.g., RAM, CPU, NIC, etc.) upon which an operating system and applications are loaded. The operating system sees a consistent, normalized set of hardware regardless of the actual physical hardware components. http://www.vmware.com/virtualization/
VPS
A virtual private server (also called a virtual server, and abbreviated VPS or VDS) is a method of partitioning a physical server computer into multiple servers that each have the appearance and capabilities of running on their own dedicated machine. Each virtual server can run its own full-fledged operating system, and each server can be independently rebooted. The practice of partitioning a single server so that it appears as multiple servers has long been common practice in mainframe computers, but has seen a resurgence lately with the development of virtualization software and technologies for other architectures. http://en.wikipedia.org/wiki/Virtual_private_server
vrius scan
WAMP
WAMP is like LAMP except using Windows instead of Linux.
web
web 2.0
Web Analytics
Web analytics is the study of the behaviour of website visitors. In a commercial context, web analytics especially refers to the use of data collected from a web site to determine which aspects of the website work towards the business objectives; for example, which landing pages encourage people to make a purchase. Data collected almost always includes web traffic reports. It may also include e-mail response rates, direct mail campaign data, sales and lead information, user performance data such as click heat mapping, or other custom metrics as needed. This data is typically compared against key performance indicators for performance, and used to improve a web site or marketing campaign's audience response. http://en.wikipedia.org/wiki/Web_analytics
Web Framework
A web application framework is a software framework that is designed to support the development of dynamic websites, Web applications and Web services. The framework aims to alleviate the overhead associated with common activities used in Web development. For example, many frameworks provide libraries for database access, templating frameworks and session management, and often promote code reuse. http://en.wikipedia.org/wiki/Web_application_framework
WEb hosting
Web Server
The term Web server can mean one of two things: 1. A computer program that is responsible for accepting HTTP requests from clients, which are known as Web browsers, and serving them HTTP responses along with optional data contents, which usually are Web pages such as HTML documents and linked objects (images, etc.). 2. A computer that runs a computer program which provides the functionality described in the first sense of the term. http://en.wikipedia.org/wiki/Web_server
web site
Webcast
A webcast is a live media file distributed over the Internet using streaming media technology. Essentially, webcasting is broadcasting over the Internet. http://en.wikipedia.org/wiki/Webcast
website
widget
Windows
wordpress
xen
xmpp
xsl mp3.com perl xml mod_perl apache
yahoo
yslow
教育技术