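Course Seminar | Principles of Databases 1 | Week 2-4

Author: Internet

Discussion Topic

What is a relational database? What is a non-relational database? How do they differ? Give one typical product of each and briefly describe its characteristics.

Discussion

Relational Databases

A relational database is a database built on the relational model.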

What is a Relational Database?

Relational databases maintain data in tables, providing an efficient, intuitive, and flexible way to store and access structured information. Tables, also known as relations, consist of columns containing one or more data categories, and rows, also known as table records, containing a set of data defined by the category. Applications access data by specifying queries, which use operations such as project to identify attributes, select to identify tuples, and join to combine relations. The relational model for database management was developed by IBM computer scientist Edgar F. Codd in 1970.
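
The three operations named above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The employee and department tables, their columns, and the sample rows are invented for illustration and do not come from the quoted sources.

```python
import sqlite3

# In-memory database; all table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT,
                           salary INTEGER, dept_id INTEGER);
    INSERT INTO department VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO employee VALUES
        (1, 'Ada', 120, 1), (2, 'Grace', 110, 1), (3, 'Linus', 90, 2);
""")

# Project: keep only some attributes (columns).
projected = conn.execute("SELECT name FROM employee ORDER BY emp_id").fetchall()

# Select: keep only the tuples (rows) that satisfy a predicate.
selected = conn.execute(
    "SELECT emp_id, name, salary, dept_id FROM employee "
    "WHERE salary > 100 ORDER BY emp_id"
).fetchall()

# Join: combine two relations on a shared attribute.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

print(projected)
print(selected)
print(joined)
```

In SQL, a single SELECT statement usually mixes all three: the column list projects, the WHERE clause selects, and the JOIN clause combines relations.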

How do Relational Databases Work?

Relational databases provide an environment from which data can be accessed or reassembled in a variety of different ways without needing to reorganize the database tables. Each table has a unique identifier, or primary key, which identifies the information in the table, and each row contains a unique instance of data for the categories defined by the columns. For instance, the table might have a primary key of ‘First Names’ and rows with specific examples such as ‘John, Paul, George and Ringo.’

The logical connection between different tables can then be established with the use of foreign keys - a field in a table that connects to the primary key data of another table. Relational Database Management Systems often employ SQL or structured query language for gathering data for reports and for interactive queries. So in our example, First Names might be linked to a Role table with data roles of Lead Vocals, Bass Guitar, Drums and Lead Guitar.
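
As a rough illustration of the primary-key and foreign-key ideas in this example, the sketch below builds the two tables in SQLite through Python's sqlite3 module. The table layout, the numeric role ids, the assignment of roles to names, and the extra name 'Pete' are all assumptions made for the demo; only the names and role labels come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE role (
        role_id   INTEGER PRIMARY KEY,
        role_name TEXT NOT NULL
    );
    CREATE TABLE member (
        first_name TEXT PRIMARY KEY,                          -- primary key
        role_id    INTEGER NOT NULL REFERENCES role(role_id)  -- foreign key
    );
    INSERT INTO role VALUES
        (1, 'Lead Vocals'), (2, 'Bass Guitar'), (3, 'Drums'), (4, 'Lead Guitar');
    INSERT INTO member VALUES
        ('John', 1), ('Paul', 2), ('George', 4), ('Ringo', 3);
""")

def try_insert(first_name, role_id):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO member VALUES (?, ?)", (first_name, role_id))
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_key_ok = try_insert("John", 2)   # primary key 'John' already exists
dangling_ref_ok = try_insert("Pete", 99)   # role 99 does not exist
valid_ok = try_insert("Pete", 3)           # new key, existing role
member_count = conn.execute("SELECT COUNT(*) FROM member").fetchone()[0]
```

The first two inserts are rejected by the database itself, which is the point of declaring keys: the rules live with the data rather than in each application.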

How is Data in a Relational Database System Organized?

The relational model of the relational database separates logical data structures from physical storage structures, enabling database administrators to manage physical data storage without affecting access to that data as a logical structure. The distinction also applies to database operations: logical operations allow an application to specify the content it needs, and physical operations determine how that data should be accessed and then carry out the task.
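
One way to see this separation in practice is to run the same logical query before and after a purely physical change. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the customer table, its rows, and the index name are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, city TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO customer (city, name) VALUES (?, ?)",
    [("Paris", "Ana"), ("Oslo", "Bo"), ("Paris", "Cy")],
)

# The logical request: what the application wants, not how to fetch it.
QUERY = "SELECT name FROM customer WHERE city = 'Paris' ORDER BY name"

def run_and_plan():
    """Return the query result and the engine's description of its access path."""
    rows = conn.execute(QUERY).fetchall()
    # The last column of each plan row is a human-readable detail string.
    plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]
    return rows, plan

rows_before, plan_before = run_and_plan()
# Physical change only: the query text and the table definition stay the same.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
rows_after, plan_after = run_and_plan()

print(plan_before)
print(plan_after)
```

The rows returned are identical in both runs; only the plan changes, from a full table scan to a lookup through the new index.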

What are the Advantages of a Relational Database?

The main advantage of a relational database is its formally described, tabular structure, from which data can be easily stored, categorized, queried, and filtered without needing to reorganize database tables.

What is a Relational Database Management System?

A Relational Database Management System (RDBMS) is a collection of programs and capabilities, organized around tables, that provides an interface between users, applications, and the database, offering a systematic way to create, update, delete, manage, and retrieve data. Most relational database management systems use SQL to access the database, and many guarantee the ACID (Atomicity, Consistency, Isolation, Durability) properties for their transactions.
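
Atomicity, the first of the ACID properties, can be sketched with a money transfer in SQLite via Python's sqlite3 module. The account table, the names, and the amounts are made up for the example; the CHECK constraint stands in for a business rule that balances never go negative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO account VALUES ('alice', 100), ('bob', 50);
""")

def transfer(src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))

ok_small = transfer("alice", "bob", 30)    # both updates are committed together
ok_large = transfer("alice", "bob", 500)   # the debit fails, so the credit is undone too
final = balances()
print(final)
```

The failed transfer leaves no trace: the credit to the second account was applied inside the transaction but rolled back with it, so the data never shows a half-finished transfer.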

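Non-Relational Databases

A non-relational database stores data in a non-tabular form and is more flexible in structure than a traditional SQL-based relational database. It does not follow the relational model provided by traditional relational database management systems.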

Non-relational databases (often called NoSQL databases) are different from traditional relational databases in that they store their data in a non-tabular form. Instead, non-relational databases might be based on data structures like documents. A document can be highly detailed while containing a range of different types of information in different formats. This ability to digest and organize various types of information side-by-side makes non-relational databases much more flexible than relational databases.

Non-relational databases are often used when large quantities of complex and diverse data need to be organized. For example, a large store might have a database in which each customer has their own document containing all of their information, from name and address to order history and credit card information. Despite their differing formats, each of these pieces of information can be stored in the same document.

Non-relational databases can answer some queries faster because a query doesn’t have to read several tables in order to deliver an answer, as queries against relational databases often do. Non-relational databases are therefore well suited to storing data that may change frequently and to applications that handle many different kinds of data. They can support rapidly developing applications requiring a dynamic database able to change quickly and to accommodate large amounts of complex, unstructured data.
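
The customer-document idea can be sketched in plain Python, with a dictionary standing in for a document collection. The customer id, field names, and values below are invented; a real document database would add persistence, indexing, and a query language on top of this shape.

```python
import json

# A hypothetical in-memory "collection": customer id -> one self-contained document.
customers = {
    "c-1001": {
        "name": "Jane Doe",
        "address": {"street": "1 Main St", "city": "Springfield"},
        "orders": [
            {"order_id": 1, "items": ["keyboard", "mouse"], "total": 75.0},
            {"order_id": 2, "items": ["monitor"], "total": 180.0},
        ],
        "payment": {"card_type": "visa", "last4": "4242"},
    }
}

def order_totals(customer_id):
    """Read one document and return its order totals; no other lookups are needed."""
    doc = customers[customer_id]
    return [order["total"] for order in doc["orders"]]

totals = order_totals("c-1001")
address_json = json.dumps(customers["c-1001"]["address"])
print(totals)
print(address_json)
```

Strings, nested objects, and lists sit side by side in one record, and answering "what did this customer order?" touches a single document instead of several joined tables.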

A NoSQL (originally referring to “non-SQL” or “non-relational”) database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Such databases have existed since the late 1960s, but the name “NoSQL” was only coined in the early 21st century, triggered by the needs of Web 2.0 companies. NoSQL databases are increasingly used in big data and real-time web applications. NoSQL systems are also sometimes called “Not only SQL” to emphasize that they may support SQL-like query languages or sit alongside SQL databases in polyglot-persistent architectures.

Motivations for this approach include: simplicity of design, simpler “horizontal” scaling to clusters of machines (which is a problem for relational databases), finer control over availability and limiting the object-relational impedance mismatch. The data structures used by NoSQL databases (e.g. key–value pair, wide column, graph, or document) are different from those used by default in relational databases, making some operations faster in NoSQL. The particular suitability of a given NoSQL database depends on the problem it must solve. Sometimes the data structures used by NoSQL databases are also viewed as “more flexible” than relational database tables.

The Benefits of a Non-Relational Database

Today’s applications collect and store increasingly vast quantities of ever-more complex customer and user data. The benefit of this data to businesses, of course, lies in its potential for analysis. Using a non-relational database can unlock patterns and value even within masses of variegated data.

Non-relational databases and application development

Applications must be able to query data efficiently and deliver results almost instantly. Non-relational databases are a natural choice for this kind of environment. They offer both security and agility, allowing for rapid development of applications in an agile environment. Often easier and less complex to manage than relational databases, they can also yield lower data management costs while delivering high performance.

A natural fit for agile development, non-relational databases can accommodate complex data inputs more efficiently than rigidly structured databases. In an age of increasing data complexity, they provide the flexibility in database design that has become increasingly indispensable. Especially when paired with the cloud, non-relational databases lift the limits on data collection, organization, and analysis, allowing you to get the most out of your data.

Types of NoSQL Databases

Over time, four major types of NoSQL databases emerged: document databases, key-value databases, wide-column stores, and graph databases. Let’s examine each type.
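
To make the four types more concrete, here is a rough sketch of how the same small piece of user data might be shaped under each model, using only plain Python structures. The keys, field names, and values are hypothetical, and real products differ in many details.

```python
import json

# Key-value: an opaque value looked up by a single key.
key_value = {"user:42": '{"name": "Ada", "city": "London"}'}

# Document: a structured, nested record whose fields the store can query.
document = {"_id": 42, "name": "Ada", "address": {"city": "London"}, "tags": ["admin"]}

# Wide-column: row key -> column family -> columns; rows need not share columns.
wide_column = {
    "user:42": {
        "profile": {"name": "Ada", "city": "London"},
        "activity": {"last_login": "2021-01-06"},
    },
    "user:43": {"profile": {"name": "Bob"}},
}

# Graph: nodes plus labelled edges between them.
nodes = {42: {"name": "Ada"}, 43: {"name": "Bob"}}
edges = [(42, "FOLLOWS", 43)]

def followed_by(node_id):
    """Names of the nodes that `node_id` follows, found by walking the edge list."""
    return [
        nodes[dst]["name"]
        for src, label, dst in edges
        if src == node_id and label == "FOLLOWS"
    ]

city_from_kv = json.loads(key_value["user:42"])["city"]  # the client decodes the value
city_from_doc = document["address"]["city"]              # the field is addressable
bob_city = wide_column["user:43"]["profile"].get("city") # column absent for this row
ada_follows = followed_by(42)
```

The difference is mostly in what the store understands: a key-value store sees only the key, a document store sees the fields, a wide-column store sees sparse columns per row, and a graph store sees the relationships themselves.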

Data model               Performance  Scalability      Flexibility  Complexity  Functionality
Key–value store          high         high             high         none        variable (none)
Column-oriented store    high         high             moderate     low         minimal
Document-oriented store  high         variable (high)  high         low         variable (low)
Graph database           variable     variable         high         high        graph theory
Relational database      variable     variable         low          moderate    relational algebra

Differences and Comparison

Non-relational databases, or NoSQL databases, store and organize data by means other than the tabular relations model used in relational databases. Where relational databases store data in rows and columns, have strict rules concerning data variety and table relationships, and follow strict ACID properties, non-relational databases offer a more flexible data structure based on the BASE (Basically Available, Soft state, Eventual consistency) model. Basically Available means the data stays available: there will be a response to any request, but without any consistency guarantee. Soft State means that the state of the system may change over time. Eventual Consistency means that the system will eventually become consistent once it stops receiving inputs.

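Relational databases are well suited to storing structured data, such as user accounts and addresses:

  1. Such data usually needs structured queries (admittedly a truism), such as joins, and this is where relational databases have the edge
  2. The size and growth rate of such data are usually predictable
  3. Transactions and consistency are required

NoSQL is well suited to storing unstructured data, such as articles and comments:

  1. Such data is usually used for fuzzy processing, such as full-text search and machine learning
  2. Such data is massive, and its growth rate is hard to predict
  3. Depending on the characteristics of the data, NoSQL databases usually offer unlimited (or at least nearly unlimited) scalability
  4. Fetching data by key is very efficient, but support for joins and other structured queries is relatively weak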
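
The last point in the list above can be illustrated with a toy key-value store held in a Python dictionary. The keys and records are invented; the sketch only shows why a lookup by key is cheap while a join has to be assembled by the application.

```python
# Hypothetical key-value data: each value is found by its exact key.
store = {
    "article:1": {"title": "Intro to NoSQL", "author_id": "user:7"},
    "article:2": {"title": "CAP in practice", "author_id": "user:8"},
    "user:7": {"name": "Li"},
    "user:8": {"name": "Wang"},
}

def get(key):
    """Single-key read: one hash lookup, however many other keys exist."""
    return store.get(key)

def titles_with_authors():
    """A 'join' done by hand: scan the article keys, then look up each author."""
    result = []
    for key in sorted(store):
        if key.startswith("article:"):
            article = store[key]
            author = get(article["author_id"])
            result.append((article["title"], author["name"]))
    return result

one_user = get("user:7")
pairs = titles_with_authors()
print(one_user)
print(pairs)
```

In a relational database the second function would be a single JOIN that the engine can plan and optimize; here the scan and the follow-up lookups are the application's responsibility.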

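CAP Theorem

A distributed computing system cannot guarantee all three of the following at the same time: consistency, availability, and partition tolerance.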

Many relational database systems support built-in replication features where copies of the primary database can be made to other secondary server instances. Write operations are made to the primary instance and replicated to each of the secondaries. Upon a failure, the primary instance can fail over to a secondary to provide high availability. Secondaries can also be used to distribute read operations. While write operations always go against the primary replica, read operations can be routed to any of the secondaries to reduce system load.

Data can also be horizontally partitioned across multiple nodes, such as with sharding. But sharding dramatically increases operational overhead by splitting data across many pieces that cannot easily communicate. It can be costly and time-consuming to manage, and it can end up impacting performance, table joins, and referential integrity.
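
A minimal sketch of hash-based sharding, written in plain Python, shows both the benefit and the cost described here. The shard count, key format, and rows are assumptions for the demo, and production systems typically use more elaborate schemes such as consistent hashing or range partitioning.

```python
import hashlib

NUM_SHARDS = 3  # assumed cluster size
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key):
    """Map a key to a shard with a stable hash (built-in hash() is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def put(key, row):
    shards[shard_for(key)][key] = row

def get(key):
    """A lookup by key touches exactly one shard."""
    return shards[shard_for(key)].get(key)

def find_by_city(city):
    """A query on a non-key column must visit every shard and merge the results."""
    return sorted(
        key for shard in shards for key, row in shard.items() if row["city"] == city
    )

for i, city in enumerate(["Paris", "Oslo", "Paris", "Rome"]):
    put(f"customer:{i}", {"city": city})

one_row = get("customer:2")
paris_keys = find_by_city("Paris")
```

Reads and writes by key scale out because each one lands on a single shard, while anything that cuts across keys, such as the city search or a join, turns into a fan-out over all shards.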

If data replicas were to lose network connectivity in a “highly consistent” relational database cluster, you wouldn’t be able to write to the database. The system would reject the write operation as it can’t replicate that change to the other data replica. Every data replica has to update before the transaction can complete.
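
The behavior described in this paragraph can be imitated with a toy model in plain Python. The Replica and SyncCluster classes are hypothetical and model no particular product; the reachable flag simply simulates a lost network connection.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.reachable = True  # set to False to simulate a network partition

class SyncCluster:
    """Toy model: a write commits only if every replica can apply it."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        if not all(replica.reachable for replica in self.replicas):
            return False  # reject: the change cannot be replicated everywhere
        for replica in self.replicas:
            replica.data[key] = value
        return True

primary, secondary = Replica("primary"), Replica("secondary")
cluster = SyncCluster([primary, secondary])

first = cluster.write("price", 10)    # both replicas reachable: accepted
secondary.reachable = False           # the secondary drops off the network
second = cluster.write("price", 12)   # rejected: replicas would otherwise diverge
```

The cluster stays consistent, since both replicas still hold the value 10, but it gives up availability for writes while the partition lasts.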

In CAP terms, relational databases typically provide consistency and availability, but not partition tolerance. They are typically provisioned on a single server and scale vertically by adding more resources to the machine.

NoSQL databases typically support high availability and partition tolerance. They scale out horizontally, often across commodity servers. This approach provides tremendous availability, both within and across geographical regions at a reduced cost. You partition and replicate data across these machines, or nodes, providing redundancy and fault tolerance. The downside is consistency. A change to data on one NoSQL node can take some time to propagate to other nodes. Typically, a NoSQL database node will provide an immediate response to a query - even if the data that is presented is stale and hasn’t updated yet.

If data replicas were to lose connectivity in a “highly available” NoSQL database cluster, you could still complete a write operation to the database. The database cluster would allow the write operation and update each data replica as it becomes available.

This kind of result is known as eventual consistency, a characteristic of distributed data systems where ACID transactions aren’t supported. It’s a brief delay between the update of a data item and time that it takes to propagate that update to each of the replica nodes. Under normal conditions, the lag is typically short, but can increase when problems arise. For example, what would happen if you were to update a product item in a NoSQL database in the United States and query that same data item from a replica node in Europe? You would receive the earlier product information, until the cluster updates the European node with the product change. By immediately returning a query result and not waiting for all replica nodes to update, you gain enormous scale and volume, but with the possibility of presenting older data.
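
The United States and Europe scenario can be played through with a small simulation in plain Python. The Node and EventualCluster classes are invented for illustration; there is no conflict resolution, and updates are simply queued for unreachable nodes and replayed in order, which real systems handle in far more sophisticated ways.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.reachable = True  # set to False to simulate lost connectivity

class EventualCluster:
    """Toy model: accept writes on one node and push them to the others when possible."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.pending = []  # (target node, key, value) updates not yet delivered

    def write(self, node, key, value):
        node.data[key] = value  # the write is accepted immediately
        for other in self.nodes:
            if other is not node:
                self.pending.append((other, key, value))
        self.sync()

    def sync(self):
        """Deliver queued updates, in order, to every node that is reachable now."""
        still_pending = []
        for target, key, value in self.pending:
            if target.reachable:
                target.data[key] = value
            else:
                still_pending.append((target, key, value))
        self.pending = still_pending

us, eu = Node("us"), Node("eu")
cluster = EventualCluster([us, eu])
cluster.write(us, "product:1", "old description")

eu.reachable = False
cluster.write(us, "product:1", "new description")  # still accepted
stale_read = eu.data["product:1"]                  # Europe serves the earlier value

eu.reachable = True
cluster.sync()
fresh_read = eu.data["product:1"]                  # the replicas have converged
```

Between the second write and the final sync, a reader in Europe sees the earlier description; once the queued update is delivered, both nodes agree again.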

Consider a NoSQL datastore when:

  1. You have high volume workloads that require large scale
  2. Your workloads don’t require ACID guarantees
  3. Your data is dynamic and frequently changes
  4. Data can be expressed without relationships
  5. You need fast writes and write safety isn’t critical
  6. Data retrieval is simple and tends to be flat
  7. Your data requires a wide geographic distribution
  8. Your application will be deployed to commodity hardware, such as with public clouds

Consider a relational database when:

  1. Your workload volume is consistent and requires medium to large scale
  2. ACID guarantees are required
  3. Your data is predictable and highly structured
  4. Data is best expressed relationally
  5. Write safety is a requirement
  6. You work with complex queries and reports
  7. Your users are more centralized
  8. Your application will be deployed to large, high-end hardware

Typical Products

SQL Server

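SQL Server is a relational database management system developed by Microsoft. As a database server, it is a software product whose primary function is to store and retrieve data as requested by other software applications, which may run either on the same computer or on another computer across a network.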

MongoDB

MongoDB is a cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed by MongoDB Inc. and licensed under the Server Side Public License (SSPL).

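MongoDB combines scalability and flexibility with the querying and indexing that applications need. It offers high performance, is easy to deploy and use, and makes storing data very convenient.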

References

[1] https://www.omnisci.com/technical-glossary/relational-database

[2] https://en.wikipedia.org/wiki/Relational_database

[3] https://www.mongodb.com/non-relational-database

[4] https://www.mongodb.com/nosql-explained

[5] https://en.wikipedia.org/wiki/NoSQL

[6] https://docs.microsoft.com/en-us/dotnet/architecture/cloud-native/relational-vs-nosql-data

[7] https://en.wikipedia.org/wiki/CAP_theorem

[8] https://en.wikipedia.org/wiki/Microsoft_SQL_Server

[9] https://www.microsoft.com/en-us/sql-server/sql-server-2019-features

[10] https://www.mongodb.com/what-is-mongodb

Source: https://blog.csdn.net/Smoothie_/article/details/112309965