open-courses
  • 公开课笔记
  • CMU 15-445/645 Database Systems
    • Relational Data Model
    • Advanced SQL
    • Database Storage
    • Buffer Pools
    • Hash Tables
    • Tree Indexes
    • Index Concurrency Control
    • Query Processing
    • Sorting&Aggregations
    • Join Algorithms
    • Query Optimization
    • Parallel Execution
    • Embedded Database Logic
    • Concurrency Control Theory
    • Two Phase Locking
    • Timestamp Ordering Concurrency Control
    • Multi-Version Concurrency Control
    • Logging Schemes
    • Database Recovery
    • Introduction to Distributed Databases
    • Distributed OLTP Databases
    • Distributed OLAP Databases
  • UCB - CS162
    • OS intro
    • Introduction to the Process
    • Processes, Fork, I/O, Files
    • I/O Continued, Sockets, Networking
    • Concurrency: Processes & Threads
    • Cooperating Threads, Synchronization
    • Semaphores, Condition Variables, Readers/Writers
    • Scheduling
    • Resource Contention & Deadlock
    • Address Translation, Caching
    • File System (18,19,20)
    • Distributed Systems, Networking, TCP/IP, RPC (21,22)
    • Distributed Storage, Key-Value Stores, Security (23)
    • Security & Cloud Computing (24)
    • Topic: Ensuring Data Reaches Disk
  • MIT - 6.006
    • Sequence and Set Interface
    • Data Structure for Dynamic Sequence Interface
    • Computation Complexity
    • Algorithms and Computation
    • Structure Of Computation
    • Graph & Search
    • Tree & Search
    • Weighted Shortest Paths
    • String Matching, Karp-Rabin
    • Priority Queue Interface & Implementation
    • Dictionary Problem & Implementation
    • Sorting
    • Dynamic Programming
    • Backtracking
    • Self-Balancing Tree
  • MIT - 6.824
    • 2PC & 3PC
    • Introduction and MapReduce
    • RPC and Threads
    • Primary/Backup Replication
    • Lab: Primary/Backup Key/Value Service
    • Google File System (GFS)
    • Raft
    • Lab: Raft - Leader Election
    • Lab: Raft - Log Replication
  • Stanford-CS107
    • 原始数据类型及相互转化
    • 指鹿为马
    • 泛型函数
    • 泛型栈
    • 运行时内存结构
    • 从 C 到汇编
    • 函数的活动记录
    • C 与 C++ 代码生成
    • 编译的预处理过程
    • 编译的链接过程
    • 函数的活动记录续、并发
    • 从顺序到并发和并行
    • 信号量与多线程 1
    • 信号量与多线程 2
    • 复杂多线程问题
    • 函数式编程 - Scheme 1
    • 函数式编程 - Scheme 2
    • 函数式编程 - Scheme 3
    • 函数式编程 - Scheme 4
    • 函数式编程 - Scheme 5
    • Python 基础
  • MIT - 6.001 - SICP
    • 什么是程序
    • 程序抽象
    • 替代模型
    • 时间/空间复杂度
    • 数据抽象
    • 高阶函数
    • Symbol
    • 数据驱动编程与防御式编程
    • 数据抽象中的效率与可读性
    • 数据修改
    • 环境模型
    • 面向对象-消息传递
    • 面向对象 - Scheme 实现
    • 构建 Scheme 解释器
    • Eval-Apply Loop
    • Normal Order (Lazy) Evaluation
    • 通用机
    • 寄存器机器
    • 子程序、栈与递归
    • 在寄存器机器中执行
    • 内存管理
  • MIT - 6.046
    • Randomized Algorithms
    • Skip Lists
  • System Design
    • Twitter
    • Cache Consistency & Coherence
  • DDIA 笔记
    • Replication
    • Transactions
    • The Trouble with Distributed Systems
    • Consistency & Consensus
  • Papers We Love
    • Consistent Hashing and Random Trees (1997)
    • Dynamic Hash Tables (1988)
    • LFU Implementation With O(1) Complexity (2010)
    • Time, Clocks, and the Ordering of Events in a Distributed System (1978)
    • Dapper, a Large-Scale Distributed Systems Tracing Infrastructure (2010)
    • Gorilla: A Fast, Scalable, In-Memory Time Series Database (2015)
  • Release It 笔记
    • Anti-patterns & Patterns in Microservice Architecture
  • Database Design
    • Log Structured Merge (LSM) Tree & Usages in KV Stores
    • Prometheus
Powered by GitBook
On this page
  • 回顾:如何解决一个算法问题
  • 问题及解决方法举例
  • Contains
  • Max Subarray Sum
  1. MIT - 6.006

Structure Of Computation

PreviousAlgorithms and ComputationNextGraph & Search

Last updated 6 years ago

回顾:如何解决一个算法问题

  • Reduce:将问题转化成你熟悉的数据结构及算法

  • Design:根据计算图设计某种递归/迭代算法。不同计算图所采用的解法不同,解法可以分为以下几个类别:Brute Force、Decrease & Conquer、Divide & Conquer、Dynamic Programming 和 Greedy / Incremental

问题及解决方法举例

Contains

  • Problem:给定一个包含 n 个整数的数组 A,判断它是否包含整数 v

  • Example:

    • input: A=[1,2,3,4,5],v=3A = [1, 2, 3, 4, 5], v = 3A=[1,2,3,4,5],v=3

    • output: true

Brute Force O(n)O(n)O(n)

def contains(A, v):
    for i in range(len(A)):
        if A[i] == v: return True
    return False

Decrease & Conquer O(n)O(n)O(n)

def contains(A, v, i=0):
    if i >= len(A) or i < 0:
        return False
    if A[i] == v:
        return True
    return contains(A, v, i+1)
def contains(A, v, i=0, j=None):
    if j is None:
        j = len(A)
    if j - i == 0:
        return False
    c = (i+j) // 2
    if A[c] == v:
        return True
    return contains(A, v, i, c) or contains(A, v, c+1, j)
def contains(A, v, i=0, j=None):
    if j is None:
        j = len(A)
    if j - i == 0:
        return False
    c = (i+j) // 2
    if A[c] == v:
        return True
    elif A[c] > v:
        return contains(A, v, i, c)
    else:
        return contains(A, v, c+1, j)

Max Subarray Sum

  • Problem: 给定一个长度为 n 的整数数组 A,找到和最大的子数组

  • Example:

    • output: 16

遍历所有可能的子数组:

def MSS(A):
    m = A[0]
    for j in range(1, len(A)+1):
        for i in range(0, j):
            m = max(m, SS(A, i, j))
    return m

def SS(A, i, j):
    s = 0
    for k in range(i, j):
        s += A[k]
    return s
  • 最大子数组有三种可能:整体位于左半边、整体位于右半边,横跨左右半边

  • 计算图为树形

def MSS(A, i=0, j=None):
    if j is None:
        j = len(A)
    if j - i == 1:
        return A[i]
    c = (i+j)//2
    return max(MSS(A, i, c), 
               MSS(A, c, j),
               MSS_EA(A, i, c) + MSS_SA(A, c, j))
  
def MSS_SA(A, i, j):
    s = m = A[i]
    for k in range(1, j-i):
        s += A[i+k]
        m = max(s, m)
    return m

def MSS_EA(A, i, j):
    s = m = A[j-1]
    for k in range(1, j-i):
        s += A[j-1-k]
        m = max(s, m)
    return m

Dynamic Programming

def MSS(A):
    m = A[0]
    for k in range(len(A)):
        s = MSS_EA(A, 0, k+1)
        m = max(m, s)
    return m

现在问题变得更容易了,图 B 中只有 n 个节点,且是一个有向无环图(DAG),恰好该图的拓扑排序正好是从 0 到 n,因此可以得到如下算法:

def MSS(A):
    m = mss_ea = A[0]
    for i in range(1, len(A)):
        mss_ea = A[i] + max(0, mss_ea)
        m = max(m, mss_ea)
    return m

Divide & Conquer O(n)O(n)O(n)

Greedy, if sorted A O(logn)O(logn)O(logn)

Input: A=[−9,1,−5,4,3,−6,7,8,−2]A = [-9, 1, -5, 4, 3, -6, 7, 8, -2]A=[−9,1,−5,4,3,−6,7,8,−2]

Brute Force O(n3)O(n^3)O(n3)

长度 = 1 的子数组 (n1){n}\choose{1}(1n​) ,长度 > 1 的子数组 (n2){n}\choose{2}(2n​) ,总数为 O(n2)O(n^2)O(n2)

对长度为 k 的子数组求和,时间复杂度为 O(k)O(k)O(k)

总体时间复杂度为 O(n3)O(n^3)O(n3)

Divide & Conquer O(nlogn)O(nlogn)O(nlogn)

T(n)=2T(n/2)+O(n)=>T(n)=O(nlogn)T(n) = 2T(n/2) + O(n) => T(n) = O(nlogn)T(n)=2T(n/2)+O(n)=>T(n)=O(nlogn)

如果MSS_EA(A,0,k)MSS\_EA(A, 0, k)MSS_EA(A,0,k) 表示 A 中以第 k 个元素结尾的子数组,以下简称 (A,0,k)(A, 0, k)(A,0,k) ,问题就按如下方式解决:

但 MSS_EA 的时间复杂度为 O(n2)O(n^2)O(n2) ,总体时间复杂度就与暴力解法如出一辙。但我们仔细思考,可以发现该过程中存在着重复计算:

如图 1 所示,在计算和最大的子数组过程中,重复计算了 n 次 (A,0,0)(A, 0, 0)(A,0,0) ,n-1 次 (A,0,1)(A, 0, 1)(A,0,1),...重新整理一下计算图,可以得到:

图 1 - 计算图 A
图 2 - 计算图 B